Monday, December 27, 2010

A Christmas present from the monks

After what seems an age I finally got promotion to Prior, next stop (in 3000 XP points time) is Monsignor. My best node was Unexpected Output with a score of (strangely) 42.

It's weird: I get such a buzz from helping people with Perl that it is almost addictive.

Monday, November 01, 2010

Python msvcrt.locking trials and tribulations

It should be easy. All I wanted to do was replace my C demos, which use flock, with a nice homely Python version on Windows. I wanted to stick with the standard library so I could just slot it into the existing course (so I couldn't use the Locking module on PyPi).

A simple demo to start with:
1. Process 1: lock a region, write a record, pause
2. Process 2: attempt to lock the same region
3. Allow Process 1 to release the lock
4. Process 2 continues

Converting the code was easy (I thought), but no matter what I did I always found the region was locked for the second process - the lock was not being released.

I blame the documentation, and possibly the implementation.

What is not obvious is that the current file position must be reset to the original locking position before the lock gets released. The write (of course) advances the current file position, so by the time we do the unlock we are unlocking a region for which we don't have the lock in the first place! Wouldn't it be better if msvcrt.locking returned a lock object, saving the file position? Anyway, here is my completed demo:


"""
lock_w.py

Run two copies, each in its own terminal session.
Allow one to write a number of records, then switch
to the other showing that it blocks on the same record.
Switch back and release the lock, and show that the
blocked process proceeds.

Also run with lock_r to demonstrate interaction with
read locks.

Default filename is rlock.dat

Clive Darke QA
"""

import msvcrt
import os
import sys
import time
from datetime import datetime

REC_LIM = 20

if len(sys.argv) < 2:
    pFilename = "rlock.dat"
else:
    pFilename = sys.argv[1]

fh = open(pFilename, "w")

# Do only once
Record = dict()
Record['Pid'] = os.getpid()

for i in range(REC_LIM):
    Record['Text'] = "Record number %02d" % i
    Record['Time'] = datetime.now().strftime("%H:%M:%S")

    # Get the size of the area to be locked
    line = str(Record) + "\n"

    # Get the current start position
    start_pos = fh.tell()

    print "Getting lock for",Record['Text']
    msvcrt.locking(fh.fileno(), msvcrt.LK_RLCK, len(line)+1)
    fh.write(line) # This advances the current position

    raw_input("Written record %02d, clear the lock?" % i)

    # Save the end position
    end_pos = fh.tell()

    # Reset the current position before releasing the lock
    fh.seek(start_pos)
    msvcrt.locking(fh.fileno(), msvcrt.LK_UNLCK, len(line)+1)

    # Go back to the end of the written record
    fh.seek(end_pos)

    print

fh.close()
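
The lock object wished for above could look something like the following. This is only a sketch: RegionLock is a made-up name, not part of msvcrt, and the locking function and mode values are passed in by the caller (on Windows they would be msvcrt.locking, msvcrt.LK_RLCK and msvcrt.LK_UNLCK). It assumes the file is opened unbuffered in binary mode, so that fh.tell() agrees with the operating system's idea of the file position.

```python
class RegionLock(object):
    """Hypothetical helper: remembers where the lock was taken and
    seeks back to that position before releasing it."""

    def __init__(self, fh, nbytes, locking, lock_mode, unlock_mode):
        self.fh = fh
        self.nbytes = nbytes
        self.locking = locking          # e.g. msvcrt.locking
        self.lock_mode = lock_mode      # e.g. msvcrt.LK_RLCK
        self.unlock_mode = unlock_mode  # e.g. msvcrt.LK_UNLCK
        self.start = None

    def __enter__(self):
        self.fh.flush()
        # The lock starts at the current file position, so save it
        self.start = self.fh.tell()
        self.locking(self.fh.fileno(), self.lock_mode, self.nbytes)
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        self.fh.flush()
        end = self.fh.tell()
        # Go back to where the lock was taken before unlocking
        self.fh.seek(self.start)
        try:
            self.locking(self.fh.fileno(), self.unlock_mode, self.nbytes)
        finally:
            # Carry on from the end of whatever was written
            self.fh.seek(end)
        return False

# Intended use on Windows (untested assumption):
#
#   fh = open("rlock.dat", "wb", 0)
#   with RegionLock(fh, len(line), msvcrt.locking,
#                   msvcrt.LK_RLCK, msvcrt.LK_UNLCK):
#       fh.write(line)
```

The caller no longer has to save and restore positions by hand, which is exactly the step that is so easy to forget.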

Wednesday, October 13, 2010

Update

I try to keep these posts technical, but I have not blogged in a while so I'll break with tradition.

I have been teaching a lot of Python. Mostly this has been Python 2, but some clients want Py3. Unfortunately not everyone understands how Open Source language releases work, and some internally assumed that no-one would want Python 2 training as soon as Python 3 was released.

I have to confess that, having done a lot of Python recently, moving back to Perl is rather a drudge. Python is not perfect, but I now find the syntax of Perl unnecessarily fussy, and Perl 6 is even worse. I mean fussy in the sense that Perl code is to Python as a doily is to a beer-mat.

Talking of Perl 6, with the release of Rakudo Star I thought (hoped) that we would get a flood of requests for courses, but we have not had one. Rakudo Star is, I guess, still too early, and I suppose early adopters are happy to learn it themselves.

Meanwhile, roughly half of the delegates that I am teaching Python have come from Perl. There does not appear to be a consistent reason for this, and often it is not the practitioner's decision anyway. It appears to be the perception of where Perl is in the scheme of things. It's all about image and marketing. What can Perl do about that? Perl 6, but Python is catching up.

Here is a simple example. One of the nice things about Perl is the way that lists can be used:

($one, $two, @fred) = qw(The quick brown fox);

Cannot do that in Python 2, but in Python 3:

one, two, *fred = ('The', 'quick', 'brown', 'fox')

The * indicates a greedy list (and you thought Python didn't have sigils?). No qw() equivalent yet though, and Perl array and hash slices are still more powerful. Unless you know better...
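
One idiom that gets close to qw() is calling split() on a string literal. The sketch below (variable names are just for illustration) shows that alongside the starred assignment, plus the indexing-and-slice spelling that gives the same result without the star:

```python
# str.split() on a literal stands in for Perl's qw()
words = "The quick brown fox".split()

# Python 3: the starred name collects whatever is left over
one, two, *fred = words
print(one, two, fred)

# The star does not have to come last
first, *middle, last = words
print(first, middle, last)

# Without the star: plain indexing and a slice
one2, two2, fred2 = words[0], words[1], words[2:]
print(one2, two2, fred2)
```

The starred target always receives a list, even when nothing is left over for it.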

Now I learn that Civilisation V is using Lua as its scripting language because Python (used in Civ. IV) is too slow. Is Lua next on my list?

Thursday, August 05, 2010

Sony Vaio Windows 7 "No Internet access"

This error has been driving me crazy. I got it when trying to connect to certain WiFi's, but not all. Surfing the web gave various solutions, none of which worked. I eventually cracked it by accident, so I'm posting it in the hope I might prevent someone else from going insane. It is far too late for me.

Sony bundles all sorts of clever programs on its laptops. It's mostly concerned with multi-media, but one of them concerns us - VAIO Control Center (sic). So go to start/"All Programs" and select it. Now select "Network Connections", then "VAIO Smart Network".
Here be Dragons.

VAIO Smart Network creates "profiles". Click on "Advanced", recognise that stupid dialogue displayed at start-up? Click "Settings" (bet you never thought of going there).
Now select "Profile Settings" from the left panel. Yes, I know this navigation is tortuous, but we are nearly there. From here you can edit one of your profiles, or create a new one. When a new profile is created from the desktop dialogue when it first connects, it defaults everything to the previous settings. It was carrying over a DNS setting I had from another connection, and that was preventing me from connecting. Select the "IP and DNS" tab and ensure that "Obtain an IP address automatically" and "Obtain DNS server address automatically" are both selected (it was the latter which screwed me).

The profile is selected by the name of the network. One thing that Sony engineers did not take into account in their design is a name collision. That is, two distinctly different networks having the same name. Many of our training centres have WiFi's, and they all are named the same, however they have different DNS addresses. That allowed me to connect to one training centre successfully, but then fail at all the others because the profile remembered the DNS address from the previous site. Going through the Microsoft Windows network settings did not detect where this was being held, because Sony were causing the problem not Microsoft (for once).

Hope this helps.

Sunday, July 25, 2010

A view from EuroPython 2010

I have just returned from EuroPython which was, like last year, in Birmingham, UK. This post will not go into too much technical detail, email if you need more. I'm guessing that the reader does not want to know how Python implements IEEE 754 floating point format (that's even mentioned in the Programming Foundations course, do try to keep up!). It requires about 4000 lines of C code to convert between float and text - no, you didn't want to know that.
Neither am I going to give a blow-by-blow account of each talk - they are available on the europython 2010 website.
There were just under 400 delegates this year - slightly down on last year. The organisers should not be concerned though: early indications from rival conference YAPC Europe (Yet Another Perl Conference) are that attendance is approximately half of last year's.
Organisation was slightly better than 2009, a number of lessons appear to have been learnt. Talk streams were still patchy, but I guess that's inevitable.


Multiprocessing, the GIL, and threading
The conference was opened by Russel Winder who impressed last year as well. The theme was very familiar to me, it told the story that Intel, and others, have been banging on about for a couple of years now - the fact that multi-core CPUs are here to stay and currently the only way to get the increase in speed beloved of developers. Russel went much further though, with my brain screaming ME TOO! Where Intel's solutions are distinctly small scale with core numbers in single figures, Russel spoke of much larger clusters (note the term). His contention was that current hardware solutions with cache are not scalable beyond 16, and neither are software solutions using threads. Message passing systems, like MPI, are a more likely future than threading systems like TBB or OpenMP. And no one in their right mind would be programming native threads. Hear, hear. A thousand times, hear, hear. All these lessons were learnt on mainframes in the 1970s: those that ignore history are destined to repeat it. It is a great shame that Russel's talk was rushed.

That did not stop discussion in the conference about the vagaries of the Global Interpreter Lock (GIL), the much maligned excuse for not doing multithreading in Python. Hey guys: multithreading is hard, error prone, and not necessarily all that faster. Let's just accept that and move on. There will be a reworked GIL in Python 3.2, work which has come out of Google's "Unladen Swallow" project. It will be interesting to see how much that helps (?) and what excuses people will use to avoid multithreading in the future.

I was browsing one of the book stalls, looking at an Advanced Python book when it hit me how much Python has moved forward in the past couple of years. The section on Multiprocessing mentioned an out-of-date module, not Subprocess and Multiprocessing. I dismissed the book as "out-of-date" even though the first edition was 2008 - QA's own Python courses have always covered those two modules. Then I realised that we have only had our own courses for about a year, although I have been tinkering with Python for (wow!) over ten years.


Python's progress
Version 2.7 of Python was released on 4th July. OK, I'll update the QAPYTH2 course material as soon as someone gives me the time. (Actually many of the significant changes in 2.7 are back-ports from Python 3.1, so I can filch the material from my Python 3 course. Just don't tell the boss). Python 2.7 is the last Python 2 major release. Version 3.1 is now the only Python development stream, although 2.7 will continue to be maintained for around five years. When I started writing our own Python course material in late 2008 I was interrupted by the Python 3.0 release literally as I was writing the PowerPoint slides. I decided that a new course on Python 2 was daft, and switched to Python 3. It just so happened that I thought that Python 3 was a vast improvement for the language as well (don't cry for me, Perl 6). Six months later at EuroPython 2009 I realised I might have made a mistake, so I back-ported the course material to Python 2. I ended up with two courses and "let the market decide". One year on and we have taught many more Python 3 courses than Python 2, and we are in a great position to move forward, unencumbered by legacy Python. Maybe it wasn't such a bad idea to target Python 3, but it was scary to be ahead of the curve.

Python 3.2 should be out around the end of the year. One speaker gave January 2011 and another said "before the end of this year". Take your pick. Christmas? Python 3.3 was mentioned a few times, it should include Google's "Unladen Swallow" (yes Ian, it is from "Monty Python and the Holy Grail") with a target of a 5 times performance improvement (European or Asian?). No dates yet.

My impression is that many developers still have not realised the benefits of Python 3, and have no plans to move over. They will.

It looks like Python will overtake Visual Basic in the Tiobe stakes in the next few months (it passed Perl a couple of years ago). Are developers realising the benefits of Python? They are.


The cheese shop (the Python module repository: PyPi)
(Recent course delegate: "but the Monty Python cheese shop had no cheese in it!". Me: "neither has PyPi")

One of the main benefits of Perl (what?) is CPAN. Dave Cross (well-known Perlmonger) has said that this was the main reason for not moving to Python. Well Dave, Python just had its 10,000th module uploaded (round of applause). Actually I'm not so sure that is such a good thing, duplication and crud makes navigation difficult. Still, it's an indication of popularity and progress.


One of the main benefits of Python 2 against Python 3 is the cheese shop. Here too Python 3 is catching up. The module numbers are still only a few hundred, but major modules are appearing all the time, for example NumPy (numerical processing) was just released for Python 3.


Wot I learnt
Quite a lot, as always. I learnt that the expression for names such as __path__ is "dunder path". Nice one. I also learnt about the importance of the new unittest module (that'll have to go in) and changes in the way import works for Python 3.2. I discovered how technicians lie about language comparisons to their managers to justify using a cool product (actually, I already knew that). Speakers were at times less than accurate with their comparisons against Perl.

I learnt how HTML5 is going to make Microsoft Silverlight obsolete. Well the speaker did not actually say that, but I can dream.

There were some fascinating statistics, like there are more people in India with access to a mobile 'phone than to a flush toilet (please don't take the comparison further). I would love to use that in a course if only I could find the derivation.

The portability issues of using ActiveX are well-known to anyone using Microsoft's remote desktop (there are no portability issues - it's not portable), but apparently not to the South Korean government.

The spec. for HTML1 was three pages; for HTML5 it is 900. That's progress.

Oracle had a stand at the conference, and they were giving out DVDs with developer resources on them for, let's see, Linux, PHP, Ruby, Python, and Oracle VM. Hummm, spot the missing language. Me: "What about Perl then?"; Oracle chappie: "oh, that's only used by system administrators for scripting". Go figure.

And you can write cool games in Python, drive neat little robots, automate PowerPoint slides (I gotta get me some of that, and the robot stuff). And the Python Software Foundation are as cheerfully disorganised as everyone else. As with all conferences, it is always nice to have opinions confirmed, and to be able to say from time to time "actually, I knew that".

Oh, and I saw a great example for "If then else for the lazy" in our UNIX fundamentals course:
./configure && make


Finally...
Python is self-assured, confident, and looking forward. The community is proud of its product, and itself. So they should be.

Thursday, March 04, 2010

Python exercise - alternative solution

I promised to post an alternative solution to a Python exercise for an on-site today. For other interested parties, the code lists unused ports from /etc/services:

import sys

# set the file name depending on the operating system
if sys.platform == 'win32':
    file = r'C:\WINDOWS\system32\drivers\etc\services'
else:
    file = '/etc/services'

# Create an empty dictionary
ports = dict()

# Iterate through the file, one line at a time
for line in open(file):

    # Ignore lines starting with '#' and those containing only whitespace
    if line[0:1] != '#' and not line.isspace():

        # Extract the second field (separated by \s+)
        pp = line.split(None, 1)[1]

        # Extract the port number from port/protocol
        port = pp.split ('/', 1)[0]

        # Convert to int, then store as a dictionary key
        port = int(port)
        ports[port] = None

        # Give up after port 200
        if port > 200: break

# Print any port numbers not present as a dictionary key
for num in xrange(1,201):
    if not num in ports:
        print "Unused port", num







Here is a smaller solution using Regular Expressions:

import sys,re

file = r'C:\WINDOWS\system32\drivers\etc\services' \
    if sys.platform == 'win32' else '/etc/services'

found = set()
for line in open(file):
    m = re.search(r'^[^#].*\s(\d+)/(tcp|udp)\s',line)
    if m:
        port = int(m.groups()[0])
        if port > 200: break
        found.add(port)

print set(range(1,201)) - found
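
For anyone on Python 3, the same set-difference idea can be sketched as a function that takes any iterable of lines, which makes it easy to try without a real services file. The function name unused_ports, the limit parameter and the sample lines are inventions for this illustration, not part of the original exercise:

```python
import re

# Same pattern as the regular-expression solution: a non-comment line
# containing a port/protocol field
PORT_RE = re.compile(r'^[^#].*\s(\d+)/(tcp|udp)\s')


def unused_ports(lines, limit=200):
    """Return the sorted port numbers in 1..limit that no line mentions."""
    found = set()
    for line in lines:
        m = PORT_RE.search(line)
        if m:
            port = int(m.group(1))
            if port > limit:
                break          # relies on the file being in port order
            found.add(port)
    return sorted(set(range(1, limit + 1)) - found)


# A few made-up lines in /etc/services format
sample = [
    "# comment line\n",
    "tcpmux   1/tcp\n",
    "echo     7/tcp\n",
    "echo     7/udp\n",
    "discard  9/tcp   sink null\n",
]
print(unused_ports(sample, limit=10))
```

To run it against the real file, pass an open file object instead of the sample list.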

Friday, January 29, 2010

On the takeover of Sun by Oracle

I guess I might as well comment - everyone else and his dog is doing so. A colleague said "A sad, sad day for FOSS". I know what he means, but actually I disagree. Take-overs and mergers will always happen.

It is occasions like this that demonstrate the power of FOSS. FOSS gives us choice, and we can choose to stay with MySQL or migrate to PostgreSQL, which all the smart people were using anyway. IMHO, technically PostgreSQL knocks MySQL into a cocked hat, and the Oracle takeover of MySQL could give PostgreSQL the increase in popularity it deserves. Other databases are available.

Sun blew hot and cold on FOSS anyway, I could never figure out what their policy was. Java probably has too much momentum of its own for Oracle to screw it up, although if anyone can.... Never underestimate the ability of corporates to kill things off by hubris (I used to work for Computer Associates).

There will always be other languages, other databases, other operating systems.

Other hardware?

Back in the 1980s a colleague (Haydn Moston - are you still around?) told me that CISC (Complex Instruction Set Computer) chips could never last beyond 2000, and RISC (Reduced Instruction Set Computer) chips were the only way to go. The i386 family is CISC, and Sun Sparc is the most well-known RISC.

Intel are running out of steam and having to use multi-core; ten years out was actually not bad, Haydn. I'm not sure that the switch to multi-core is connected specifically with CISC, but the possible loss of RISC machines is my biggest worry with the Oracle takeover. The number of mainstream instruction sets out there is disappointingly small. How I hate monocultures.

Thursday, January 07, 2010

File notification events on Windows

One of my most popular posts is example code for Inotify on Linux. I have also been asked for similar code for Windows, so yer tis:

There are two different interfaces available on Win32; I prefer ReadDirectoryChangesW because it is easier to control:



// ------------------------------------------------------------------
// Clive Darke QA Training
// ReadDirectoryChangesW example
// ------------------------------------------------------------------

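#define _WIN32_WINNT 0x0400   // needed for ReadDirectoryChangesW
#include <cstdlib>            // wcstombs
#include <cstring>            // strcpy
#include <iostream>
#include <windows.h>

using namespace std;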

void DisplayLastError( LPSTR lpszText );

// ------------------------------------------------------------------

int main ( int argc, char *argv[] )
{
    HANDLE hDir;
    DWORD dwReturned;
    BOOL bResult;
    FILE_NOTIFY_INFORMATION *pNotify;

    if ( argc < 2 )
    {
        cerr << "You must supply a directory name" << endl;
        return 1;
    }

    // Note FILE_FLAG_BACKUP_SEMANTICS, which is the strange
    // attribute required to get a handle to a directory.

    hDir = CreateFile (
        argv[1],                            // pointer to the file name
        FILE_LIST_DIRECTORY,                // access (read-write) mode
        FILE_SHARE_READ|FILE_SHARE_DELETE,  // share mode
        NULL,                               // security descriptor
        OPEN_EXISTING,                      // how to create
        FILE_FLAG_BACKUP_SEMANTICS,         // file attributes
        NULL                                // file with attributes to copy
    );

    char Buffer[MAX_PATH] = {0};

    while ( TRUE )
    {
        char szAction[42];
        char szFilename[MAX_PATH];

        bResult = ReadDirectoryChangesW (hDir, &Buffer, sizeof(Buffer),
            TRUE, FILE_NOTIFY_CHANGE_FILE_NAME, &dwReturned, NULL, NULL);

        if ( !bResult )
            break;

        pNotify = (FILE_NOTIFY_INFORMATION *) Buffer;

        switch (pNotify->Action)
        {
            case FILE_ACTION_ADDED :
                strcpy (szAction, "added");
                break;
            case FILE_ACTION_REMOVED :
                strcpy (szAction, "removed");
                break;
            case FILE_ACTION_MODIFIED :
                strcpy (szAction, "modified");
                break;
            case FILE_ACTION_RENAMED_OLD_NAME :
                strcpy (szAction, "renamed");
                break;
            case FILE_ACTION_RENAMED_NEW_NAME :
                strcpy (szAction, "renamed");
                break;
            default:
                strcpy (szAction, "Unknown action");
        }

        wcstombs( szFilename, pNotify->FileName, MAX_PATH);
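        cout << "File " << szFilename << " " << szAction << endl;
    }

    return 0;
}

// ------------------------------------------------------------------

void DisplayLastError( LPSTR lpszText )
{
    DWORD dwError = GetLastError();
    LPSTR lpMessageBuffer = NULL;
    // ... (the remainder of this function is missing from the listing)
}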