Archive for the ‘Uncategorized’ Category

Git notes

June 23rd, 2008 by Eric Albright

Now that I’ve used git for a couple weeks, I thought I’d make a few notes of commands I’ve found helpful.

To make a local branch for development
git checkout -b name_of_new_branch

To commit changes to the local repository (although I usually use the visual gittk for this)
git commit -a

To commit changes back to subversion
git svn dcommit

To uncommit
git reset HEAD~1

To keep a local branch up to date with subversion (use git stash to hide away local uncommitted changes for later)
git svn rebase

To move the master branch up to trunk
git checkout master
git svn rebase

To handle merge conflicts
git mergetool path_to_file_needing_merge
or
git mergetool -t toolname path_to_file_needing_merge

To remove untracked files (like the temps that get created during a merge resolve)
git clean -n (to see what it would do)
git clean -f -d (-d if you want to remove untracked directories as well)

Git, Subversion and a CRLF mess

June 23rd, 2008 by Eric Albright

When initializing from WeSay’s Subversion repository (git svn init -t tags -b branches -T trunk http://www.wesay.org/code/WeSay), git then reported that a ton of files had changed. It turns out that on Windows, git has core.autocrlf = true by default, which is a good thing. But git-svn apparently doesn’t take this into account, so if CRLFs are stored in the svn repository, they are pushed into the git repository as well. So for now we have a repository that contains CRLFs instead of plain LFs that get translated depending on the platform. Setting core.autocrlf to false and then doing a hard reset makes this work for now, although not as nicely as we would like. (git config core.autocrlf false; git reset --hard)
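To check whether a file really contains CRLF line endings, a small script can count them. The following is a minimal sketch in Python; it is not part of git or git-svn, and the function names are made up for illustration. The conversion only approximates what core.autocrlf does to text files when they are stored.

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF and bare LF line endings in a byte string."""
    crlf = data.count(b"\r\n")
    # Every CRLF also contains an LF, so subtract to get the bare LFs.
    lf = data.count(b"\n") - crlf
    return {"crlf": crlf, "lf": lf}


def to_lf(data: bytes) -> bytes:
    """Convert CRLF to LF (an approximation of storing text with autocrlf on)."""
    return data.replace(b"\r\n", b"\n")


sample = b"first\r\nsecond\r\nthird\n"
print(count_line_endings(sample))
print(count_line_endings(to_lf(sample)))
```

Running it over the raw bytes of a checked-out file shows at a glance which convention that file uses.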

Merging with git

June 12th, 2008 by Eric Albright

Git still doesn’t have good Unicode support, so to merge Unicode files that git has labeled binary, I wanted to use a visual merge tool. I finally figured out how to do it. Add the following lines to the git config file:

[merge]
   tool = tortoise

[mergetool "tortoise"]
   cmd = \"TortoiseMerge.exe\" /base:\"$BASE\" /theirs:\"$REMOTE\" /mine:\"$LOCAL\" /merged:\"$MERGED\"

[mergetool "p4"]
   cmd = \"p4merge.exe\"  \"$BASE\" \"$REMOTE\" \"$LOCAL\" \"$MERGED\"

If you don’t have TortoiseMerge.exe in your path then you can replace that with the full path (c:/Program Files/TortoiseSVN/bin/TortoiseMerge.exe).

Upgrading user settings in C#

June 10th, 2008 by Tim Armstrong

In the course of development we found it necessary to migrate an old user setting into a new one and to then remove it. This brought with it a few problems which I hope to shed some light on below.

In order to get the value of the old setting we used the Properties.Settings GetPreviousVersion() method. Initially we were getting a SettingsPropertyNotFoundException although the setting was verifiably present in the user.config file. As it turns out, we had removed the property from the Settings designer, which removed the property from the Properties.Settings class. In order for a setting to be found, it has to have a property that is tagged with the [UserScopedSettingAttribute] attribute. This tells the GetPreviousVersion() method to look for the setting in user.config. So far so good…

At this point however, the base.Upgrade() method is called to move old settings into the new file. This causes the old, unwanted setting to be moved in right along with all the old settings that we want to keep around. In order to avoid this behavior the [NoSettingsVersionUpgrade] attribute must also be used for the unwanted Property.

public override void Upgrade()
{
    // Read the old value first, before the settings are upgraded.
    string lastConfigFilePath = (string) GetPreviousVersion("LastConfigFilePath");
    base.Upgrade(); // bring forward our properties that are the same
                    // (without [NoSettingsVersionUpgrade] this would
                    // also bring forward LastConfigFilePath)
}

[UserScopedSettingAttribute]
[DebuggerNonUserCode]
[DefaultSettingValueAttribute("")]
[Obsolete("Please use MruConfigFilePaths instead")]
[NoSettingsVersionUpgrade]
public string LastConfigFilePath
{
    get
    {
        throw new NotSupportedException("LastConfigFilePath is obsolete");
    }
    set
    {
        throw new NotSupportedException("LastConfigFilePath is obsolete");
    }
}
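The order of operations matters more than the framework details, so here is a language-neutral sketch of the same idea in Python, using plain dictionaries in place of the .NET settings classes. It is an illustration only: the helper names, the Language setting, and the step that moves the old path into MruConfigFilePaths are assumptions, not our actual code.

```python
def upgrade(previous: dict, defaults: dict, no_upgrade: set) -> dict:
    """Carry previous values forward, skipping names marked as no-upgrade."""
    current = dict(defaults)
    for name, value in previous.items():
        if name in defaults and name not in no_upgrade:
            current[name] = value
    return current


def migrate(previous: dict, defaults: dict) -> dict:
    # Read the obsolete value first, while the previous settings are at hand.
    last_path = previous.get("LastConfigFilePath", "")
    # Then bring forward everything except the obsolete setting.
    current = upgrade(previous, defaults, no_upgrade={"LastConfigFilePath"})
    # Assumed step: fold the old single path into the new list setting.
    if last_path and last_path not in current["MruConfigFilePaths"]:
        current["MruConfigFilePaths"] = [last_path] + current["MruConfigFilePaths"]
    return current


previous = {"LastConfigFilePath": "old.config", "Language": "fr"}
defaults = {"LastConfigFilePath": "", "MruConfigFilePaths": [], "Language": "en"}
print(migrate(previous, defaults))
```

The obsolete setting stays at its default, while its old value survives in the new setting.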

An enchant provider for LIFT

May 13th, 2008 by Eric Albright

We wanted to allow users to edit their dictionary and use that same dictionary for spell checking. Since WeSay uses LIFT as the file format for the dictionary and keeps that file up to date, all we needed was an enchant provider that can read LIFT files.

I took the spell checking engine I had written a while back, Ascens, and refactored it so that it could read files of various formats. Currently it supports line based and XML based formats. For line based formats, the words are entered one per line. For XML based formats, an XPath expression determines what text from within the file should be selected to constitute correctly spelled words.

Ascens looks for a settings file with the same name as the language identifier that is passed to enchant. Within the settings file, the location of the dictionary and the type of the dictionary are specified. If the type is xml then the xpath expression should be defined.

The following is an example settings file for Ascens referring to a LIFT file:

# This is the settings file for Ascens
[Dictionary]
# Type is either xml or line
# for xml you also need to set the XPath
#Type=line
Type=xml

# path to the dictionary
# (can be absolute or relative to the directory that this file is in)
#Path=c:\documents and settings\user\my documents\dictionaries\fr_FR.dic
#Path=fr_FR.dic
Path=..\..\..\My Documents\WeSay\French\French.lift

# XPath gives the Xpath that selects the words to be used as dictionary
# it must all be on a single line
XPath=//entry[not(citation-form/form[@lang='fr'])]/lexical-unit/form[@lang='fr']/text | //entry/citation-form/form[@lang='fr']/text
# this xpath selects the forms with the language id of 'fr' from the
# citation form when there is one and from the lexical unit when
# there is no citation form (it will not select both)
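To see what that XPath selects, here is a rough Python equivalent. Python's built-in ElementTree does not support not() or the | union, so the sketch applies the same rule entry by entry. The sample document is a made-up fragment shaped to match the XPath, not a complete LIFT file, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up sample: one entry with a citation form, one without.
LIFT_SAMPLE = """
<lift>
  <entry>
    <lexical-unit><form lang="fr"><text>chevaux</text></form></lexical-unit>
    <citation-form><form lang="fr"><text>cheval</text></form></citation-form>
  </entry>
  <entry>
    <lexical-unit><form lang="fr"><text>maison</text></form></lexical-unit>
  </entry>
</lift>
"""


def select_words(lift_xml: str, lang: str) -> list:
    """Pick the citation form when an entry has one, else the lexical unit."""
    root = ET.fromstring(lift_xml)
    words = []
    for entry in root.iter("entry"):
        citation = entry.findall(f"citation-form/form[@lang='{lang}']/text")
        lexical = entry.findall(f"lexical-unit/form[@lang='{lang}']/text")
        chosen = citation if citation else lexical
        words.extend(t.text for t in chosen if t.text)
    return words


print(select_words(LIFT_SAMPLE, "fr"))  # ['cheval', 'maison']
```

Note that "chevaux" is skipped because its entry has a citation form, which is exactly the "it will not select both" behavior described in the comments.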

Enchant looks for user Ascens settings files in the following locations:

  1. The ascens subdirectory of the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.
  2. %APPDATA%\enchant\ascens, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. The enchant\ascens subdirectory of the directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir, if there is one.
  4. %USERPROFILE%\enchant\ascens, where %USERPROFILE% is shorthand for the C:\Users\<username> folder (Windows Vista) or the C:\Documents and Settings\<username> folder (Windows XP/2000).

Enchant looks for shared Ascens settings files in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\ascens\Data_Dir, if there is one. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\ascens\Data_Dir, if there is one.
  2. <enchant>\share\enchant\ascens, where <enchant> is the location of libenchant.dll.
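The user search order above can be sketched as a function that builds the list of candidate directories. This is only an illustration in Python, not Enchant's code: the registry is stood in for by a plain dictionary keyed by the full value name, the environment is passed in the same way, and the function name is made up.

```python
from pathlib import PureWindowsPath

DATA_DIR_KEY = r"HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir"
HOME_DIR_KEY = r"HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir"


def user_ascens_dirs(registry: dict, env: dict) -> list:
    """Return candidate user directories in the order listed above."""
    candidates = []
    if registry.get(DATA_DIR_KEY):
        candidates.append(PureWindowsPath(registry[DATA_DIR_KEY]) / "ascens")
    if env.get("APPDATA"):
        candidates.append(PureWindowsPath(env["APPDATA"]) / "enchant" / "ascens")
    if registry.get(HOME_DIR_KEY):
        candidates.append(PureWindowsPath(registry[HOME_DIR_KEY]) / "enchant" / "ascens")
    if env.get("USERPROFILE"):
        candidates.append(PureWindowsPath(env["USERPROFILE"]) / "enchant" / "ascens")
    return [str(path) for path in candidates]


env = {"APPDATA": r"C:\Users\me\AppData\Roaming", "USERPROFILE": r"C:\Users\me"}
for directory in user_ascens_dirs({}, env):
    print(directory)
```

With no registry values set, only the %APPDATA% and %USERPROFILE% locations are produced.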

WeSay Tests on Mono Status

April 22nd, 2008 by Eric Albright

One step toward getting WeSay to run on the OLPC is to verify that it can run with Mono. We already reported all the System.Windows.Forms bugs that we could find by running MWF on Windows, as documented here. The next step has been to run all the tests under Mono. As you can see from the diagram at left (which actually lives on our whiteboard), we have found, fixed, and reported quite a few bugs, which has made the number of failing tests plummet. We’re not there yet, but I’m making good progress.

Formatting dictionaries with CSS

February 26th, 2008 by Eric Albright

In evaluating CSS as a stylesheet language for formatting dictionaries, I started putting PrinceXML through its paces. I first tried what I considered to be the hardest dictionary layout (the Cobuild dictionary), and I think I have matched many of its features. The sidenotes, however, are just not going to happen without specialized support for them in CSS. (The closest I could get was a float, but of course if you have more than one within a line, they just write on top of each other.) That result is here. I then switched to a more typical layout, which had no problems at all. That result is here. You can get all the files to reproduce this exercise here.

Types of style

There are really a number of items which contribute to the style of a dictionary:

  1. Selection of fields
  2. Order of fields
  3. Textual markup - characters or text that is added before, after, or around items to distinguish a field from surrounding text
  4. Character styles - font changes
  5. Paragraph styles
  6. Page layout - columns

CSS actually allows us to handle items 1, 3-6. (Selection of fields can be handled by setting the display property to none.) All the textual markup in the examples was done using the CSS content property.

CSS3 Selectors

Another interesting behavior of CSS 3 is that you cannot select the first element having a class containing the word ‘pronunciation’:

.pronunciation:first-of-type

You can only use the :first-of-type selector to select the first element with a particular name, so general div and span elements with class attributes would have to be converted to named XML elements instead. There is a way around this, given that our document will be generated from another format: actually add the classes first-of-type and last-of-type. Then the data becomes:

<span class="pronunciation first-of-type">...</span><span class="pronunciation">...</span><span class="pronunciation last-of-type">...</span>

and

<span class="pronunciation first-of-type last-of-type">...</span>
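Since the document is generated anyway, adding these classes is cheap. Here is a minimal Python sketch of a generator step that does it; the function name is made up, and a real exporter would of course do this wherever it writes out the fields.

```python
from html import escape


def pronunciation_spans(values: list) -> str:
    """Emit one span per value, marking the first and the last explicitly."""
    spans = []
    last = len(values) - 1
    for index, value in enumerate(values):
        classes = ["pronunciation"]
        if index == 0:
            classes.append("first-of-type")
        if index == last:
            classes.append("last-of-type")
        class_attr = " ".join(classes)
        spans.append(f'<span class="{class_attr}">{escape(value)}</span>')
    return "".join(spans)


print(pronunciation_spans(["a", "b", "c"]))
print(pronunciation_spans(["a"]))
```

A single item gets both classes, which matches the second form of the data shown above.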

Playing with both the xml and xhtml varieties in IE7 and Firefox 2 shows that both do a much better job with the xhtml over xml.

Column-span

The only other problem I ran into was that Prince does not yet support the column-span property. This ended up not being a big problem since I just wanted the heading to span both columns and was able to work around this by making the first page of the section have a 12cm top margin and to float the heading into this space.

Configuring where Enchant looks for files

February 22nd, 2008 by Eric Albright

Editorial note (April 25, 2008): With Enchant 1.4.1 the strategy for finding dictionaries has changed. I have updated this information to reflect this version.

So far, I have covered how to get started using Enchant and how to set up dictionaries. This post will cover more advanced concepts that let an application developer or a user take more control over Enchant.

Where Enchant looks for providers

Enchant looks for which providers are available when the enchant_broker_init function is called.

Providers can be installed on the machine for all users to share or can be installed for only one user. If Enchant finds a particular provider in more than one place, the first one found is used.

Enchant looks for shared providers in the following locations:

  1. The directory value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Module_Dir, if there is one.
  2. Otherwise, in <enchant>\lib\enchant, where <enchant> is the location of libenchant.dll.

Taking user and shared locations together, Enchant loads providers from the following locations:

  1. The directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Module_Dir, if there is one.
  2. Otherwise, in %APPDATA%\enchant, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. The enchant subdirectory of the directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir, if there is one.
  4. %USERPROFILE%\enchant, where %USERPROFILE% is shorthand for the C:\Users\<username> folder (Windows Vista) or the C:\Documents and Settings\<username> folder (Windows XP/2000).
  5. The directory value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Module_Dir, if there is one.
  6. <enchant>\lib\enchant, where <enchant> is the location of libenchant.dll.

How Enchant decides which provider to load for a given language

The provider that is used for a given language is determined by the provider ordering. This can be set programmatically by using the enchant_broker_set_ordering function. Enchant initializes the ordering by looking in the enchant.ordering file. Later entries override earlier entries. There is a shared ordering file as well as a user ordering file. A user entry overrides a shared entry.

Enchant looks for the shared enchant.ordering file in the following locations:

  1. The value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Data_Dir, if any
  2. Otherwise, in <enchant>\share\enchant, where <enchant> is the location of libenchant.dll.

Enchant looks for the user enchant.ordering file in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.
  2. Otherwise, in %APPDATA%\enchant, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. <enchant>\share\enchant, where <enchant> is the location of libenchant.dll.
  4. The value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Config\Data_Dir, if any

If Enchant doesn’t find any ordering files and the ordering is not overridden programmatically, then the ordering is system dependent (I think that means the providers will be ordered alphabetically by filename).
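The override rules (later entries win over earlier ones, user entries win over shared ones) can be sketched in a few lines of Python. This is not Enchant's implementation. It also assumes each line of enchant.ordering has the form language:provider1,provider2, which this post does not spell out, so check that against a real ordering file.

```python
def parse_ordering(text: str) -> dict:
    """Parse ordering lines; a later line for the same language replaces an earlier one."""
    ordering = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or ":" not in line:
            continue
        tag, providers = line.split(":", 1)
        ordering[tag.strip()] = [p.strip() for p in providers.split(",") if p.strip()]
    return ordering


def merged_ordering(shared_text: str, user_text: str) -> dict:
    """Start from the shared file, then let user entries override it."""
    merged = parse_ordering(shared_text)
    merged.update(parse_ordering(user_text))
    return merged


shared = "*:aspell,myspell,ispell\nen_US:aspell,myspell\n"
user = "en_US:myspell,aspell\n"
print(merged_ordering(shared, user))
```

Here the user file changes the order for en_US only; the shared entry for every other language is left alone.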

Where Enchant looks for Ispell dictionaries

Enchant looks for user Ispell dictionaries in the following locations:

  1. The ispell subdirectory of the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.
  2. Otherwise, in %APPDATA%\enchant\ispell, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. The enchant\ispell subdirectory of the directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir, if there is one.
  4. %USERPROFILE%\enchant\ispell, where %USERPROFILE% is shorthand for the C:\Users\<username> folder (Windows Vista) or the C:\Documents and Settings\<username> folder (Windows XP/2000).

Enchant looks for shared Ispell dictionaries in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Ispell\Data_Dir, if there is one.
  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Ispell\Data_Dir, if there is one.
  3. Otherwise, in <enchant>\share\enchant\ispell, where <enchant> is the location of libenchant.dll.

Where Enchant looks for MySpell dictionaries

Enchant looks for user MySpell dictionaries in the following locations:

  1. The myspell subdirectory of the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Data_Dir, if there is one.
  2. Otherwise, in %APPDATA%\enchant\myspell, where %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000).
  3. The enchant\myspell subdirectory of the directory value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Config\Home_Dir, if there is one.
  4. %USERPROFILE%\enchant\myspell, where %USERPROFILE% is shorthand for the C:\Users\<username> folder (Windows Vista) or the C:\Documents and Settings\<username> folder (Windows XP/2000).

Enchant looks for shared MySpell dictionaries in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Myspell\Data_Dir, if there is one.
  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Myspell\Data_Dir, if there is one.
  3. Otherwise, in <enchant>\share\enchant\myspell, where <enchant> is the location of libenchant.dll.

In addition, if OpenOffice (or StarOffice) is installed, Enchant will find the dictionaries that are located in <openoffice>\share\dict\ooo, where <openoffice> is the location where OpenOffice is installed. A user dictionary will be used before a shared dictionary, which in turn will be used before an OpenOffice dictionary.

Where Enchant looks for the Aspell library

Enchant looks for aspell-15.dll in the following locations:

  1. Using the value found in the registry at HKEY_CURRENT_USER\Software\Enchant\Aspell\Module, if there is one (this value should include the filename and not just the path).
  2. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Enchant\Aspell\Module, if there is one (this value should include the filename and not just the path).
  3. Otherwise, using the value found in the registry at HKEY_LOCAL_MACHINE\Software\Aspell\Path, if there is one, as the path to find aspell-15.dll (this is set by the Aspell installer for Windows).
  4. Otherwise, in the same directory as libenchant_aspell.dll.
  5. Otherwise, it uses the normal Windows search strategy, which includes looking in the path.

Setting up dictionaries for Enchant

February 21st, 2008 by Eric Albright

In my last post, I gave some tips for getting started with Enchant but you really can’t get anywhere until you have properly configured the providers and installed some dictionaries.

ASpell

The ASpell provider for Enchant requires aspell-15.dll. The easiest way to get started with ASpell is to use the installer for ASpell and for dictionaries.

  1. Be sure you have the ASpell provider, libenchant_aspell.dll (you can list it with enchant-lsmod).
  2. Download the installer and run it to install ASpell.
  3. Download a dictionary installer from here and run the installer.
  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (aspell) but with the language code for the language you installed instead of en_US
  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.

It is possible to use ASpell by including the aspell-15.dll in the same directory as libenchant_aspell.dll or it can be somewhere in the path. If you install aspell using the Windows installer, it will write a registry entry that points to where it was installed and Enchant will use that to find the dependency.

MySpell/Hunspell (OpenOffice format)

Enchant doesn’t require any additional dependencies other than the MySpell provider for MySpell dictionaries but it does require you to copy the dictionary files to the right place.

  1. Be sure you have the MySpell provider, libenchant_myspell.dll (you can list it with enchant-lsmod).
  2. Download a dictionary that you want: You can get any of the dictionaries from OpenOffice.org.
  3. Unzip (or otherwise uncompress the package) and copy the contents into %APPDATA%\enchant\myspell (you may need to create the enchant and myspell directories the first time).

    %APPDATA% is shorthand for the C:\Users\<username>\AppData\Roaming\ folder (Windows Vista) or the C:\Documents and Settings\<username>\Application Data\ folder (Windows XP/2000). But you can type %APPDATA% in the explorer’s address bar and it will go to the right place.

  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (myspell) but with the language code for the language you installed instead of en_US
  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.

Note: if you install MySpell and ASpell dictionaries for the same language, the ASpell dictionaries will be used instead of the MySpell dictionaries (this can be changed but I’ll leave that for another post)

If you are feeling really adventurous and would like to create your own, you can see the directions here.

ISpell

Enchant’s Ispell provider also doesn’t have any dependencies (the dictionaries are read directly by Enchant).

  1. Be sure you have the ISpell provider, libenchant_ispell.dll (you can list it with enchant-lsmod).
  2. Download a dictionary from here (at the bottom of the page).
  3. Unzip (or otherwise uncompress the package) and copy the contents into %APPDATA%\enchant\ispell (you may need to create the enchant and ispell directories the first time).
  4. Verify that it has been installed correctly by running enchant-lsmod.exe -list-dicts. You should see something like: en_US (ispell) but with the language code for the language you installed instead of en_US
  5. You can also test it using enchant -d en_US -a (again using the language code for the language you installed). Then you can type words which are or aren’t in the dictionary and see suggestions when they aren’t.

Empty dictionaries

An easy way to get spell checking for a language that doesn’t have a dictionary is to create an empty MySpell dictionary. First, decide on the language code to be used. (You should use the ISO 639 code or the IETF language tag. For our example we will use qaa, the first of the private use language codes.) Two files are required: the affix file, qaa.aff, and the dictionary file, qaa.dic. Both should be put in %APPDATA%\enchant\myspell.

The qaa.aff file should contain the following line:

SET UTF-8

The qaa.dic file should contain the following line (it’s a zero, the number of items in the dictionary):

0

Of course, you won’t have any items in your empty dictionary so all the words will be marked as misspelled. As you add items to the dictionary using Enchant, the words will be stored in %APPDATA%\enchant\qaa.dic.
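If you need to do this for several languages, the two files are easy to generate. The following Python sketch writes them with the contents given above; the function name is made up, and the demonstration writes into a temporary directory rather than the real %APPDATA%\enchant\myspell location.

```python
import tempfile
from pathlib import Path


def create_empty_myspell_dictionary(directory: Path, lang: str) -> tuple:
    """Write an affix file and a zero-entry dictionary file for lang."""
    directory.mkdir(parents=True, exist_ok=True)
    aff = directory / f"{lang}.aff"
    dic = directory / f"{lang}.dic"
    aff.write_text("SET UTF-8\n", encoding="utf-8")
    dic.write_text("0\n", encoding="utf-8")  # zero items in the dictionary
    return aff, dic


# Demonstration only: a temporary directory stands in for the myspell folder.
with tempfile.TemporaryDirectory() as tmp:
    aff, dic = create_empty_myspell_dictionary(Path(tmp) / "enchant" / "myspell", "qaa")
    print(aff.name, dic.name)
```

To use it for real, pass the myspell directory under %APPDATA%\enchant as the first argument.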

Using Enchant in a Windows App: Getting Started

February 20th, 2008 by Eric Albright

The following are notes toward getting started with incorporating Enchant into a Windows application.

Enchant is a spell-checking framework that allows you to use many different spell-checking backends, including Aspell, Hunspell/MySpell, and Ispell.

You can get the source here. Building using MSVC is not difficult once all the dependencies are provided. The full build notes are here.

If you don’t want to bother with building it yourself, you can get the binaries here.

The main library is libenchant.dll. Enchant uses providers (libenchant_aspell.dll, libenchant_ispell.dll, and libenchant_myspell.dll), which are adapters for the various backends. (There are others available but if you want others, you will have to build them yourself.) There is also a .Net binding available (Enchant.Net.dll) that uses libenchant.dll. The Aspell provider only works if you have ASpell installed, while the Hunspell/Myspell provider and the Ispell provider read dictionaries (in the proper format) directly.

By default, Enchant expects to find the providers (the backend adapters) in the subdirectory lib\enchant underneath the location of libenchant.dll.

You can check your setup by running enchant-lsmod.exe. It will list the providers it finds. Run enchant-lsmod.exe -list-dicts to list the available dictionaries.

Next time, I’ll add more about setting up Enchant to use the dictionaries.