Monday, November 27, 2006

A speech at ATIA

I'm giving a speech at ATIA, and have three days to write the seemingly inevitable PowerPoint slides. Does anyone want to help?

The title is "Beyond Access - What's Next for Kurzweil 1000?"

I had to write a few paragraphs when I submitted the proposal - they are below:

"Software based reading programs for the Blind have moved well beyond their original purpose - access to print materials. We'll look at the many other problems that Kurzweil 1000 tries to solve. Although I'll describe current functionality, and answer questions about the current release, I'd like to spend most of the session in a discussion regarding possible features for future releases."

"This year is the 30th anniversary of the introduction of the first reading machine for the Blind. We'll briefly compare the price and performance of today's reading machine with those of the very first, which was a product from our distant corporate ancestor, Kurzweil Computer Products. Less interesting perhaps, but more important, we'll talk about the way in which product capabilities have expanded to address issues that go well beyond converting printed text into audio. I'll provide an overview of some of the newer capabilities of Kurzweil 1000, answer questions about these capabilities, and demonstrate them if that is of interest. Mainly, though, I would like a group discussion about where this product sector is going - what sort of problems should it solve in the future, and for whom?"

"Key learning objectives include:

A broader definition of what reading products for the Blind are generally, and what Kurzweil 1000 provides specifically.

A hint of what is likely to happen in future releases of Kurzweil 1000.

A sense that current users and solution providers can directly influence those future releases."

The fact that I'm asking for help here is mainly an indication that I'm still procrastinating - I hate writing these things, though I don't mind doing the presentation itself.

So, does anyone want to suggest some ideas?

Wednesday, November 15, 2006

A Miscellany of other New Features in Version 11

In previous posts, I've gone into a fair amount of detail about new features, with one feature listed per post. Here, I'll list the remaining features - they are pretty modest, but one or another may appeal to you.

An altered TTS Engine.

We've shipped IBM TTS for a number of years. With this release, we are switching to ETI Eloquence. IBM TTS has suffered from a lack of support for the last few years - we expect better of ETI Eloquence - in fact, a number of fixes were made at our request before we would accept the engine.

Recognition Optimization.

This new option works somewhat like scanning optimization, except that it uses the image associated with the current page, and runs it through several different recognition settings.

Table Output in HTML and DAISY.

Previous releases did not support table formatting when you saved documents as HTML or as DAISY. Version 11 supports this.

Read Newly Recognized Pages Setting.

If you are reading your mail, you might like to have reading begin at the top of whatever page you have just scanned. There is a setting that allows you to do that if you wish. You'll find it in the General Settings Dialog.

Searching for Blank Pages.

In the Find dialog, you can now search for blank pages. A blank page is one that contains no text, or only spaces, new lines, and tabs.
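To make the blank page definition concrete, here is a small Python sketch. It is only an illustration, not the Kurzweil 1000 code: the function names are made up, pages are assumed to be plain strings, and carriage returns are assumed to count as part of a new line.

```python
from __future__ import annotations

# Characters that still count as "blank" under the definition above.
# Treating "\r" as part of a new line is an assumption of this sketch.
BLANK_CHARACTERS = " \t\r\n"


def is_blank_page(page_text: str) -> bool:
    """True if the page has no text, or only spaces, new lines, and tabs."""
    return all(ch in BLANK_CHARACTERS for ch in page_text)


def find_blank_pages(pages: list[str]) -> list[int]:
    """Return the 1-based page numbers of the blank pages."""
    return [number for number, text in enumerate(pages, start=1)
            if is_blank_page(text)]
```

Calling find_blank_pages on the pages of a document gives the page numbers that a blank page search would stop on, under those assumptions.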

Enhanced Explore layout.

Explore layout now distinguishes between headers, footers, captions, tables, titles, and normal text blocks. Further, you can choose to save a picture region as a TIFF file. To do so, select the picture in the grid control, then tab to the Extract button and press enter.

A Bigger Recently Opened Files List.

The number of items in this list has been expanded from 5 to 10.

Continuous Reading in ListViews and TreeViews.

Within any dialog that contains ListView or TreeView controls, you can read the contents of those controls by pressing F5. You can stop at any item by pressing F5 again.

Expanded use of Batch Scanning Prefix.

In previous releases, you could use this setting to begin each TIFF file name with a unique prefix when you were scanning images. This was handy mainly because you could differentiate between one set of images and another. You can still do that, but you can also specify a folder name in the prefix, directing image files to a particular location, and (later), processing images from that location when you choose to recognize image files.

Extract All Images.

This is a new option in the File Utilities menu, allowing you to extract the images from all of the pages in a document.

New Features Guides.

At conferences I frequently have someone come up to me who is using, say, version 7 of Kurzweil 1000. He or she will, quite naturally, ask me what is new. To help me answer that question, the Help menu now contains a submenu labeled "New Features". Within that, you will be able to open any of the New Features guides for any of the versions of Kurzweil 1000 that were on your computer.

Suggested Dictionary Lookup.

If you misspell a word when you attempt to look it up in the dictionary, Kurzweil 1000 may suggest possible corrections that are available in the dictionary.

Copy the Entire Definition.

Once you have looked up a word, use Control+W to copy the entire definition to the clipboard.

A Few More Verbosity Settings.

You can now be notified, if you wish, when recognition of a multi-page file is complete, or when creation of audio files is complete. You can also create a chime to be played when continuous reading passes an end of paragraph, and/or a blank line.

Read All Punctuation.

A few dialogs have been changed to always read with all punctuation audible. This is handy, for example, in the edit corrections and edit pronunciations dialogs.

Switch Currently Active Document.

We have added a new item to the reading keypad. Shift+Up Arrow will move you from one document to the next, presuming you have more than one document open.

Stealing Focus after Scanning.

TWAIN Interfaces have a nasty habit of changing the keyboard focus when a scan begins. Kurzweil 1000 tries to compensate for this by saving the keyboard focus when a scan begins, and refocusing the keyboard once the scan ends. This causes trouble if you are using the Scanner Hot Key sequence in another application, and manually change the keyboard focus while a scan is in progress. You can now disable this behavior in the ScanConf diagnostic by changing the value of the setting labeled "Keep Kurzweil in Foreground".

Changed Behavior for Control+T.

Using Control+T, or the "Tools->What Time is It?" menu item, always gave you the time and the date. Now it will do one or the other. Press it once, and you will get the time. Press it again within 10 seconds, and you'll get the date. This is a handier approach, especially when used in conjunction with Control+C and Control+V, which will copy and paste the date or time into your document.
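The press-once, press-again behavior can be sketched as a tiny state machine. This is a hedged Python illustration only: the class name is hypothetical, the output formats are arbitrary, and it assumes that a third press starts over with the time, which the description above does not spell out.

```python
import time
from datetime import datetime

# From the description above: a second press within 10 seconds gives the date.
REPEAT_WINDOW_SECONDS = 10


class TimeDateKey:
    """Hypothetical model of the Control+T key."""

    def __init__(self, clock=time.monotonic, now=datetime.now):
        self._clock = clock            # measures time between key presses
        self._now = now                # supplies the time and date to announce
        self._last_time_press = None   # when the time was last announced

    def press(self) -> str:
        pressed_at = self._clock()
        moment = self._now()
        last = self._last_time_press
        if last is not None and pressed_at - last < REPEAT_WINDOW_SECONDS:
            # Second press inside the window: announce the date.
            # Assumption: the next press starts over with the time.
            self._last_time_press = None
            return moment.strftime("%Y-%m-%d")
        # First press, or the window has expired: announce the time.
        self._last_time_press = pressed_at
        return moment.strftime("%H:%M")
```

Whatever string press() returns is also what Control+C would put on the clipboard in this model.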

Changed Defaults.

Scanner Threshold now defaults to Dynamic rather than to Static. Language Identification now defaults to Disabled rather than Once per Page. The default Reading and Message Voice volumes are now 80 (some voices exhibit bad behavior when driven at their maximum volume.)

Changes to the Font Properties Dialog.

In previous releases, we combined the bold and italic attributes for a font into one setting called a font style. That's unusual, and it turns out there is a good reason why it isn't done that way. It becomes difficult to, for example, remove the bold attribute of a block of selected text without affecting the italic attribute in the same block. So, we've separated it out into a Bold setting and an Italic setting. Each has two possible values usually: Enabled or Disabled. For the Font Format dialog, though, if you select text that has both states of those attributes, you can end up with a third possibility: Mixed. For consistency, the print properties dialog was changed in a similar manner - its font style setting is replaced with two settings, bold and italic.
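Here is a short Python sketch of why separate Bold and Italic settings, plus a Mixed state, make this easier. It is an illustration under assumptions, not the actual dialog code: TextRun, attribute_state, and set_attribute are made-up names, and a selection is modeled as a list of runs that each carry their own attributes.

```python
from dataclasses import dataclass


@dataclass
class TextRun:
    """A stretch of selected text with uniform attributes (hypothetical model)."""
    text: str
    bold: bool = False
    italic: bool = False


def attribute_state(runs, attribute):
    """Report Enabled, Disabled, or Mixed for one attribute across a selection."""
    values = {getattr(run, attribute) for run in runs}
    if values == {True}:
        return "Enabled"
    if values == {False} or not values:
        # An empty selection is reported as Disabled (an assumption).
        return "Disabled"
    return "Mixed"


def set_attribute(runs, attribute, enabled):
    """Change one attribute on every run, leaving the other attribute alone."""
    for run in runs:
        setattr(run, attribute, enabled)
```

With a selection where only part of the text is bold but all of it is italic, attribute_state reports Mixed for bold and Enabled for italic, and set_attribute(runs, "bold", False) removes the bold without touching the italic.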

Scanning Time Property.

You'll find a new item in the Recognition Properties dialog. It is the first control, and, like everything else there, it's a read only text box. It is labeled "Scan Time", and its mnemonic is ALT+N. It contains the time, in seconds, that elapsed between the press of the scan button and the indication to K1000 that the scan was complete. Depending on the scanner, this may or may not include the time it took for the scan bar to return to its home position.

Creating a List of Misspelled Words.

If you are scanning something that contains a large number of unusual words, they are likely to show up in the ranked spelling dialog as misspellings. If you have access to an expert in the subject of the document, that person might be able to help you determine which words are misspellings, and which ones aren't. You can now use Control+C in the misspellings list to copy the list to the clipboard. After that, you'll be able to copy it to a new document that can be sent to the obliging expert.

Read Number Setting affects the Message Voice.

The Read Number setting, which lets you choose between whole numbers and digits, now affects the message voice as well as the reading voice.

Select Audio Device.

We have had a number of examples where people have lost the ability to use SAPI 4 voices within Kurzweil 1000, because the Audio Device ID has been changed for the speech engine. You can now use the SapiReg diagnostic to correct that problem.

Conversion Settings

When Kurzweil 1000 is asked to open a file, it often reads and converts the file into a temporary file in its own format - KES. If the format of the original file is text, RTF, Braille, HTML, XML, or DAISY, it does that conversion using techniques that we have written here at Kurzweil Educational Systems. If the format of the original file is an image, or is PDF, then the conversion is actually a recognition, and an OCR engine is used. If the format of the original file is something else - Microsoft Word, for example, then a third party conversion program is used. It is told to convert the file into a temporary RTF file. That RTF file is then converted again by Kurzweil 1000 into KES. And you wondered why it took a long time to open some files?

When you save a file, if you are not saving into the KES format, then a conversion is happening. Again, if the output format is Text, RTF, Braille, HTML, XML, or DAISY, the conversion is done using code that was written (and is controlled) by Kurzweil Educational Systems. If the format is something else, then a two step process is used - K1000 will convert the file to RTF, and then will invoke a third party conversion program to create the final file in the requested format.
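The routing described in the last two paragraphs can be summarized in a few lines of Python. This is only a sketch of the decision, not the real converter: the function names and the step labels ("in-house", "ocr", "third-party") are invented for illustration, and it assumes that opening a KES file needs no conversion at all.

```python
# Formats converted with code written at Kurzweil Educational Systems.
IN_HOUSE_FORMATS = {"text", "rtf", "braille", "html", "xml", "daisy"}
# Formats where "conversion" is really recognition by an OCR engine.
OCR_FORMATS = {"image", "pdf"}


def opening_steps(source_format):
    """List the (converter, from, to) steps used to open a file."""
    fmt = source_format.lower()
    if fmt == "kes":
        return []  # assumption: the native format opens directly
    if fmt in IN_HOUSE_FORMATS:
        return [("in-house", fmt, "kes")]
    if fmt in OCR_FORMATS:
        return [("ocr", fmt, "kes")]
    # Anything else goes through a third party converter and RTF.
    return [("third-party", fmt, "rtf"), ("in-house", "rtf", "kes")]


def saving_steps(target_format):
    """List the (converter, from, to) steps used to save a file."""
    fmt = target_format.lower()
    if fmt == "kes":
        return []
    if fmt in IN_HOUSE_FORMATS:
        return [("in-house", "kes", fmt)]
    # Two step process: KES to RTF, then a third party converter.
    return [("in-house", "kes", "rtf"), ("third-party", "rtf", fmt)]
```

For a Microsoft Word file, opening_steps("doc") shows the two conversions that make some files slow to open.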

Beginning with this release, we have provided a dialog that lets you control some of the details of file conversions. You can access this dialog using the Conversion menu item in the Settings menu. It's just below the Verbosity menu item. It will open the Conversion Settings dialog. This dialog has the usual OK and Cancel buttons at the bottom, and two important list controls at the top. The first is labeled "Action", and allows you to choose between those settings that affect the Opening of a document, and those that affect the Saving of a document. The second list is labeled "Format", and lets you choose among a selection of document formats. Below those two controls, there are a variable number of other dialog box controls. Exactly what they are and what they do is determined by the settings of the first two controls. I'll go through each of the possibilities here.

Action = Opening, Format = Text.

Split Long Pages - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. K1000 looks for form feed characters when it opens a text file. If the amount of text between form feeds exceeds some amount, a page break is forced. This setting allows you to disable that action, so that the resulting KES file has no more pages than indicated by the form feeds in the text file. The mnemonic for this control is ALT+"P".

Paragraph Analysis - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. K1000 uses a fairly sophisticated analysis to try to figure out where end of paragraph marks should be placed. The analysis is sensitive to attributes such as first line indent, average text length, the presence of blank lines, and even tries to make sense out of tables, block indents, and hanging indents. If you disable it, you will end up with each line in the original text file being treated as an end of paragraph. This preserves the look of the original text file, at the expense, often, of its editability. The mnemonic for this control is ALT+"A".
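As a rough illustration of the Split Long Pages setting for text files, here is a Python sketch. It is not the K1000 implementation: the function name is made up, the 4000 character limit is an invented placeholder for the unspecified "some amount", and forcing breaks only at line boundaries is an assumption.

```python
def split_text_into_pages(text, split_long_pages=True, max_chars=4000):
    """Split a text file into pages at form feeds.

    max_chars is a hypothetical limit; the real threshold is not documented here.
    """
    pages = []
    for page in text.split("\f"):
        if not split_long_pages or len(page) <= max_chars:
            # Disabled, or short enough: one page per form feed section.
            pages.append(page)
            continue
        # Enabled and too long: force extra page breaks at line boundaries.
        current = ""
        for line in page.splitlines(keepends=True):
            if current and len(current) + len(line) > max_chars:
                pages.append(current)
                current = ""
            current += line
        pages.append(current)
    return pages
```

With the setting disabled, the number of pages always equals the number of form feed separated sections in the file.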

Action = Opening, Format = Braille.

Language - a list box, whose possible settings are Default, Danish, Dutch, English, German, Icelandic, Italian, Norwegian, Russian, Spanish, and Swedish. The default is, well, Default. Default behavior is to look at the language supported by the current reading voice, and use it whenever a Braille document is being opened. This setting won't do much if you aren't back translating, but it can be pretty useful if, for example, you know you are opening a Spanish Braille document. The mnemonic for this control is ALT+"L".

Action = Opening, Format = PDF.

Emphasis - a list box, whose possible settings are "Recognition of Images" and "Extraction of Text". The default is "Recognition of Images". The mnemonic for this control is ALT+"E". PDF files are unusual in that they can contain images and text. Unfortunately, they don't always contain text, and even when they do, that text may not include everything that a sighted person would see when looking at the image of a page in the PDF file. When you open a PDF file, the recognition engine extracts the text and, potentially, recognizes the images for each page in the file.

If you choose to emphasize the recognition of images, the text will be used to correct minor OCR mistakes, but the bulk of the results will come from the images. This is the default for this setting. Its primary advantage is that you are pretty much guaranteed to get access to all of the text that is represented in the PDF file - regardless of whether it is available as text from that file. There are, however, a few disadvantages. It is usually slower, and, if all of the text was there, it is likely to be less accurate.

The alternative setting is "Extraction of Text". If text data is available for a page in a PDF file, that data will be trusted. Recognition will be done only to associate the text data with the image data on the page. Note that if no text data is available for a page, the image will be recognized and the results of that recognition will be made available to you. The advantages of this approach include both speed and accuracy. However, if portions of the page contain text represented only as an image, those portions will be ignored. It may be difficult for you to tell, when you read the page, that portions of it are missing.

Note also that this setting interacts with your choice of recognition engines, and you will get somewhat different results depending on which engine you choose, and which treatment you choose to emphasize.
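The per-page choice behind the Emphasis setting can be sketched like this in Python. It is a simplified illustration under assumptions, not the recognition engine's logic: the function name and both callbacks (run_ocr and correct_with_text) are hypothetical, and the step that associates text data with image data is left out.

```python
def text_for_pdf_page(embedded_text, run_ocr,
                      emphasis="Recognition of Images",
                      correct_with_text=None):
    """Pick the text for one PDF page.

    embedded_text     -- text data stored in the PDF for this page ("" if none)
    run_ocr           -- hypothetical callback that recognizes the page image
    correct_with_text -- hypothetical callback that uses the embedded text to
                         fix minor mistakes in the OCR result
    """
    has_text = bool(embedded_text.strip())
    if emphasis == "Extraction of Text":
        # Trust the text data when there is any; otherwise recognize the image.
        return embedded_text if has_text else run_ocr()
    if emphasis == "Recognition of Images":
        recognized = run_ocr()
        if has_text and correct_with_text is not None:
            return correct_with_text(recognized, embedded_text)
        return recognized
    raise ValueError("unknown emphasis: " + emphasis)
```

The sketch also shows where the speed difference comes from: with "Extraction of Text", run_ocr is never called for pages that already carry text data.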

Action = Opening, Format = RTF.

Split Long Pages - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. RTF files may already contain page breaks, but K1000 will insert additional ones if the text of a page, in its assigned font, wouldn't fit on a 14 inch printed page. By disabling this setting, you can make sure that the number of pages in the opened file matches those that exist in the original RTF file. The mnemonic for this control is ALT+"P".

Action = Opening, Format = Other.

Use Microsoft Office for Conversions - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. Microsoft Office comes with a conversion service that can convert documents in a number of different formats to RTF. From there, Kurzweil 1000 can convert the file from RTF to KES. This conversion package is usually a better choice than its alternative - a conversion service from another vendor that comes with Kurzweil 1000. However, if the conversion service from Microsoft has not been completely installed, our attempt to use it will bring up an unvoiced dialog from Office, asking you to complete the installation. If you do not have a screen reader running, this may look as though K1000 has hung. In this circumstance, it might be better to disable this setting. The mnemonic for this control is ALT+"M".

Action = Saving, Format = Text.

Add a Blank Line after each Paragraph - a list box, whose possible settings are Enabled and Disabled. The default is Disabled. One of the problems with plain text as a format is that it does not have a specific character used to mark an end of paragraph. Most of the settings for saving text have to do with trying to overcome, in one way or another, that limitation. If you enable this setting, each paragraph ending will always be followed by a blank line. If you do not enable it, paragraph endings will be followed by blank lines only if that is the case in the original file. The mnemonic for this control is ALT+"B".

Indent the First Line of each Paragraph - a list box, whose possible settings are Enabled and Disabled. The default is Disabled. If you enable this setting, the first line of each paragraph will begin with a tab, or with a certain number of spaces. If you do not enable it, paragraphs will have first line indentations only if the first line in the original paragraph begins with a tab. The mnemonic for this control is ALT+"I".

Spaces used for a First Line Indent - a text box. Possible values are the numbers 0 through 10. The default is 0. When zero, this setting indicates that a first line indent should be created with a tab character. Otherwise, the setting indicates the number of spaces to be used. This setting has no effect if the first line indent setting above it is disabled. The mnemonic for this control is ALT+"S".

Line Endings - a list box, whose possible settings are Preserve, Remove, or Wrap to Fit. The default is Preserve. When set in this manner, each line in the text file will have the same length as the original scanned lines - assuming that they were scanned by Kurzweil 1000. When set to Remove, each text line will be equal to a paragraph. Needless to say, this can create rather long text lines, but most text processors can automatically wrap long lines to fit within the width of the display window. Finally, the Wrap to Fit setting, which interacts with the maximum width setting that follows, will pretty much ignore the original line endings, but will introduce line endings as necessary to keep each line within a paragraph under a particular maximum limit. The mnemonic for this control is ALT+"L".

Maximum Width of each Text Line - a text box. Possible values are the numeric range 30 through 250. The default is 80. This setting is important only if Line Endings are set to "Wrap to Fit". It establishes what can be considered a margin for the document. Line endings will be added to keep lines under the number specified here. They can exceed that number only if a word has a length larger than the length specified here. The mnemonic for this control is ALT+"M".
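To show how the five text saving settings fit together, here is a hedged Python sketch. It is not the K1000 code: the function and parameter names are invented, a document is modeled as a list of paragraphs that each hold their original lines, and Python's standard textwrap module stands in for whatever wrapping K1000 actually does (it keeps each line at or below the maximum width).

```python
import textwrap


def save_as_text(paragraphs, blank_line_after_paragraph=False,
                 indent_first_line=False, indent_spaces=0,
                 line_endings="Preserve", max_width=80):
    """Render paragraphs (each a list of original lines) as plain text."""
    indent = ""
    if indent_first_line:
        # Zero spaces means "use a tab character" for the first line indent.
        indent = "\t" if indent_spaces == 0 else " " * indent_spaces

    output = []
    for lines in paragraphs:
        joined = " ".join(lines)
        if line_endings == "Preserve":
            rendered = list(lines)
            if rendered:
                rendered[0] = indent + rendered[0]
        elif line_endings == "Remove":
            rendered = [indent + joined]   # one long line per paragraph
        elif line_endings == "Wrap to Fit":
            # A line exceeds max_width only when a single word is longer.
            rendered = textwrap.wrap(joined, width=max_width,
                                     initial_indent=indent,
                                     break_long_words=False,
                                     break_on_hyphens=False)
        else:
            raise ValueError("unknown Line Endings setting: " + line_endings)
        output.extend(rendered)
        if blank_line_after_paragraph:
            output.append("")
    return "\n".join(output) + "\n"
```

Note that the indent setting is ignored unless indent_first_line is enabled, which mirrors the description of the Spaces used for a First Line Indent control.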

Action = Saving, Format = Braille.

Type of Braille - a list box, whose possible settings are Grade 1 and Grade 2. The default is Grade 2. This setting takes effect whenever a Braille document is saved. The mnemonic for this control is ALT+"T".

Language - a list box, whose possible settings are Default, Danish, Dutch, English, German, Icelandic, Italian, Norwegian, Russian, Spanish, and Swedish. The default is, well, Default. Default behavior is to look at the language supported by the current reading voice, and use it whenever a Braille document is being written. The mnemonic for this control is ALT+"L".

Action = Saving, Format = Other.

Use Microsoft Office for Conversions - a list box, whose possible settings are Enabled and Disabled. The default is Enabled. Microsoft Office comes with a conversion service that can convert RTF documents to a number of different formats. This conversion package is usually a better choice than its alternative - a conversion service from another vendor that comes with Kurzweil 1000. However, if the conversion service from Microsoft has not been completely installed, our attempt to use it will bring up an unvoiced dialog from Office, asking you to complete the installation. If you do not have a screen reader running, this may look as though K1000 has hung. In this circumstance, it might be better to disable this setting. The mnemonic for this control is ALT+"M".

This new dialog caused a few other changes. First, we have removed the maximum text length setting from the General Settings dialog, since we've replaced it with a few different settings in the text saving section of the conversion settings dialog. Second, the dialog for saving partial changes has a new value in the list of possible settings categories: Conversion.

Insert Signatures into Documents

I mentioned signatures with the forms recognition feature. You can scan signatures into the system, save any number of them, and insert them into any open document.

How do you create signatures? In Kurzweil 1000, with no files open, you will find another new menu item at the bottom of the Scan menu list. It is named "Create a Signature File..." It will let you scan in a signature. To do so, first get a blank, white, sheet of paper. Use a guide if necessary, and enter your signature in the top left area of the page - not too close to the corner, and as straight as possible. Place that sheet on the scanner so that the orientation will be right side up in the resulting image - we can't automatically orient a signature. Use the Scan a Signature menu item to scan in the page. An image of your signature will appear on the screen. If it is acceptable, enter a name for it.

Once you have some signature files, you might find that you'd like to insert the signature in a document. That's straightforward. You'll find that the last item in the Edit menu is "Insert a Signature". Its mnemonic is "I". It opens a dialog box, that lets you choose from the available signature files. Pick one, press enter, and it will be inserted into your document at your cursor position.

Those of you with some sight might notice that the signature is a white rectangle, with the signature in black. This might look odd on the screen, as our default text display shows a black background. Suffice it to say, most paper is white. When you print the document, your signature should look fine.

Tuesday, November 14, 2006

Bilingual Dictionaries

Kurzweil 1000 comes with the American Heritage Dictionary, 4th Edition. You can also get the Concise Oxford Dictionary for it, if you'd like. With this version, we include 12 pairs of bilingual dictionaries - useful if you want to get a short definition in English for a Spanish document, for example, or the French word for an English one.

The following dictionaries are available, assuming that you choose to install them.

Larousse Concise French to English
LSI French to English Concise
LSI French to English Large
Larousse Concise English to French
LSI English to French Concise
LSI English to French Large
AHD Spanish to English Small
LSI Spanish to English Concise
LSI Spanish to English Large
AHD English to Spanish Small
LSI English to Spanish Concise
LSI English to Spanish Large
LSI Dutch to English Large
LSI English to Dutch Large
LSI German to English Large
LSI English to German Large
LSI Italian to English Large
LSI English to Italian Large

You choose a dictionary by going into the Dictionary Lookup Dialog, and using the list labeled "Dictionary Source". You'll find that the Part of Speech control is not available for any of the bilingual dictionaries.

When you look up a word, you'll get a brief definition or group of definitions in the other language. Words will be automatically spoken by a voice that is capable of handling the language.

The definitions are quite terse, and this is not a particularly good way to get a translation of a block of text written in a language that you don't know at all. It is, however, a handy way to look up a few terms in a language that you are familiar with, when you are reading text in a language with which you are not quite so familiar.

Writing Files onto a CD

You've been able to create audio DAISY documents for a few releases now, but had to use other software to actually write those documents to a CD for use in a portable player. Now you can write them onto your CD from within Kurzweil 1000.

This functionality is available only if you have Windows 2000, Windows 2003, Windows XP, or Windows Vista. If you are using an older operating system, or you don't have a CD-ROM drive that will write CDs, the appropriate menu items will not be available.

For the rest of you, you'll find a menu item at the bottom of your Tools menu titled "CD Writing Tasks". Its mnemonic is "C". Within it, you'll find six or seven menu items, each of which opens a dialog. They are, in sequence, Add Files, Remove Files, Start Writing, Status, Erase CD, Directly Write, and Select A Drive. Their mnemonics are "A", "R", "W", "S", "E", "D", and "L". The last of them, Select A Drive, is available only if you have more than one drive that is capable of writing to a CD.

Add Files will bring up the K1000 file dialog box, and allow you to select a folder, or any number of files. These are files that you will be adding to a queue. The queue is just a folder on your hard drive that will contain copies of the files and folders that you have selected.

Remove Files will once again bring up the K1000 file dialog box. This time, though, you are positioned within the CD Writing Queue. Again, you can select a folder or any number of files. This time, though, the files are being removed from the queue.

Start Writing will begin the process of writing the queued files onto the CD. You should have a writable CD in the drive when you begin this process. A dialog box will be brought up which contains one read only text box, an OK button, and a Cancel button. As the writing progresses, the text box will show a percent done. You can press enter and exit from the dialog if you wish - the writing process will continue while you do other things. If you'd like, you can press escape or activate the cancel button. You will be asked if you are sure that you wish to cancel the write. Answer in the affirmative, and the CD writing process will be canceled at the next opportune moment.

If you have not begun writing a CD, the Status dialog will tell you how many files you have queued to be written, and the total size of those files. It will also tell you whether or not a CD is in the drive, and if one is, how much free space is available on that CD. If writing is in progress, it will bring you back to the small dialog box described above.
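Since the queue is just a folder holding copies of what you selected, the Add Files, Remove Files, and Status steps can be modeled with ordinary file operations. The Python sketch below is an illustration only: the function names are hypothetical, the queue location is whatever folder you pass in, and the free space check on the CD itself is left out.

```python
import shutil
from pathlib import Path


def add_to_queue(queue_dir, sources):
    """Copy the selected files and folders into the queue folder."""
    queue = Path(queue_dir)
    queue.mkdir(parents=True, exist_ok=True)
    for source in map(Path, sources):
        target = queue / source.name
        if source.is_dir():
            shutil.copytree(source, target, dirs_exist_ok=True)
        else:
            shutil.copy2(source, target)


def remove_from_queue(queue_dir, names):
    """Delete the named files or folders from the queue folder."""
    for name in names:
        target = Path(queue_dir) / name
        if target.is_dir():
            shutil.rmtree(target)
        elif target.exists():
            target.unlink()


def queue_status(queue_dir):
    """Return (number of queued files, total size in bytes)."""
    files = [p for p in Path(queue_dir).rglob("*") if p.is_file()]
    return len(files), sum(p.stat().st_size for p in files)
```

Because the queue holds copies, removing something from it never touches your original files.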

The next menu item, Erase CD, will erase all contents on your CD, provided you have the appropriate type of CD. Like the Start Writing menu item, it will bring up the status dialog while the erasure is in progress.

You can use the Directly Write menu item to select a folder, and immediately begin writing the contents of that folder into the root folder of the CD. The dialog that comes up allows you to select the folder. Once you accept your selection, it is read, and then written. This does not affect the queue of files or folders that you intend to write to your CD, which you might have created in other menu items. This option is particularly useful for copying CDs.

If you have multiple drives that can be used to write CDs, the final menu item, Select a Drive, will let you choose which one you wish to use. It will bring up a list, showing the paths of the available drives. This, by the way, is a saveable setting.

One last thing. If you choose to create Audio Files, you can choose the path of the writable CD if you'd like. Kurzweil 1000 will automatically write the audio files into the queue. Once audio creation is finished, you can use the Start Writing menu item to complete the process.

Linking Documents and Settings

Kurzweil 1000 allows you to save almost all of the various user settings in named settings files. But it can be pretty annoying when you realize that you have forgotten to save those settings, or when you have forgotten the name of the settings file that you used. People work hard to come up with optimal scanner settings for a particular document, for example, but may not be able to complete the scanning of that document in one session. If they forget to save their settings, it can be difficult to remember them or recreate them at a later time.

There is a new setting in the configuration settings dialog called "Link Documents and Settings". Its mnemonic is "I". It has three possible values: Disabled, Scanning Settings Only, and Most Settings. Its default value is disabled. When disabled, settings behave pretty much as they have in previous releases, except that special settings files, whose names are based on the names of your documents, are created whenever you close a document, and even when you change the currently active document if you changed a setting in that document.

If you change the configuration setting such that scanning settings are linked to documents, then scanning, recognition, and scanner margin settings are loaded whenever you open an existing document and scan a new page into it. This can be quite useful, in that you no longer have to remember to save or load those settings yourself. It can, however, be confusing. If you open an existing document, and then change some scanning, recognition, or scanner margin settings, and then scan a page, you'll lose those settings, as your older settings will be automatically, and silently, loaded.

It gets even more powerful, and potentially more confusing, if you link most settings to documents. In this case, voice, reading, general, display, and verbosity settings, along with scanning, recognition, and scanner margin settings, will all be automatically loaded whenever you open an existing document, and whenever you switch from one open document to another. You can have one document, for example, read with NeoSpeech Kate at 180 words per minute, while another uses Reed at 240 words per minute. All sorts of things start happening automatically. It can be fun. If you change a setting while one of those documents is open, the linked settings file will be automatically saved when you close it, or when you switch focus to another document.

Suppose, though, that you eventually find a voice that you like better than any other, and you want to update all of your settings files. If you did so without document linkage, you could just use the "Save Partial Settings" feature, indicate that you want to just save voice settings, select all of your settings files, and update them in one simple operation. If that operation didn't affect linked settings files, though, you would find that your old voice settings keep coming back whenever you open an existing document. To prevent that, you'll find that the Save Partial Settings operation can also let you change all of your linked settings files.

Monday, November 13, 2006

Access to Audio Files

One odd thing about the last few releases of Kurzweil 1000 was that it could create files that it didn't know how to read. Using the Create Audio Files facility, which can be found under the File->Utilities menu, you were able to create MP3, WAV, or Audio DAISY files. This was a very useful feature for taking those files elsewhere and reading them at your leisure, but it was a little peculiar to not be able to play them within Kurzweil 1000. Now you can.

You can open MP3, WAV, WMA, or Audio DAISY files within Kurzweil 1000 and play them. Kurzweil 1000 treats them as though they were open documents, and many of the same keystrokes apply. Control+Home will take you to the start of the document, Control+End to its end. The commands that move backward, forward, or repeat the current unit work pretty much as you'd expect, except that the units are blocks of time rather than blocks of text. Specifically, if the reading unit is line or sentence, it equates to 15 seconds in an audio file. If the reading unit is by paragraph, it equates to 30 seconds. You can, of course, stop reading and restart it.
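As a rough illustration of the mapping from reading units to blocks of time, here is a minimal sketch. Only the 15 and 30 second values come from the description above; the function name, the clamping at the ends of the file, and the use of seconds as the position are assumptions.

```python
# Seconds skipped per reading unit in an audio file (values from the post).
SKIP_SECONDS = {"line": 15, "sentence": 15, "paragraph": 30}


def move(position, unit, direction, duration):
    """Return the new playback position in seconds.

    direction is +1 to move forward and -1 to move backward.
    The result is clamped to the range 0..duration (an assumption).
    """
    step = SKIP_SECONDS[unit] * direction
    return max(0, min(duration, position + step))
```

For example, moving forward by sentence from 40 seconds would land at 55 seconds, and moving back by paragraph from 10 seconds would stop at the start of the file.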

You can also change the reading speed. The default reading speed for an audio file is set, somewhat arbitrarily, at 150 words per minute. At that speed, you will be hearing the expected speed of the recording. You can speed it up with F12, or slow it down with F11, until you reach 300 words per minute, or 75 words per minute. This works reasonably well for recorded text, but it can sound more than a little strange if you are playing music.
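One way to think about these numbers is as a playback rate relative to the original recording. The sketch below assumes the relationship is linear, which the description above suggests but does not state; the names are hypothetical.

```python
NORMAL_WPM = 150             # speed at which the recording plays unaltered
MIN_WPM, MAX_WPM = 75, 300   # limits reachable with F11 and F12


def playback_rate(wpm):
    """Convert a words-per-minute setting to a playback rate multiplier.

    Assumes linear scaling: 150 wpm is 1.0x, and values outside the
    75..300 range are clamped to the nearest limit.
    """
    clamped = max(MIN_WPM, min(MAX_WPM, wpm))
    return clamped / NORMAL_WPM
```

Under that assumption, the slowest setting plays the recording at half speed and the fastest at double speed.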

You can also set and use bookmarks, create and read notes, and, of course, expect Kurzweil 1000 to maintain your current reading position between sessions.

Note that if you wish to open a DAISY 2 Audio document, you should find and open the file ncc.html.

Bookmarks and Notes in All Document Types

For many releases, clients have been able to create bookmarks and notes in open documents within Kurzweil 1000. Further, they've been able to close documents and know that they will be positioned at the spot where they were last reading when they open them again. However, this only worked for documents in Kurzweil's native format - .KES, or in DAISY 3 text documents (whose extensions are .OPF). If you added notes or bookmarks to a document, and then saved it in some other format, when you reopened it you would find that those various annotations were gone. Since those documents were written in formats that are not controlled by Kurzweil Educational Systems, and since the converters for those formats (in general) were not controlled by Kurzweil Educational Systems, we couldn't preserve those features.

Now we can. Beginning with version 11, you can create and preserve bookmarks or notes, as well as your last reading position, with any kind of document, as long as that document can be opened in Kurzweil 1000.

All of this extra information is kept in a database that is maintained by K1000. That database is backed up whenever you back up your settings, and restored whenever you restore them. One consequence of this is that you can't easily share your bookmarks and notes - they aren't really a part of the file that you are annotating.

The "key" that is used to look up information in this database consists of the file name and extension, and the file size. Note that the file name does not include its full path, so you can move a file from one folder to another and Kurzweil 1000 will still be able to find bookmarks, notes, and reading position information for it. What you can't do, though, is change the file name or alter the size of the file outside of Kurzweil 1000, and expect to still retain access to this extra information.
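A small sketch may make the consequences of that key easier to see. The function name and the tuple representation are assumptions made for illustration; only the ingredients of the key (name with extension, plus size, without the path) come from the description above.

```python
import ntpath  # handles Windows-style paths on any platform


def annotation_key(path, size_in_bytes):
    """Build a lookup key from the file name (with extension) and file size.

    The folder portion of the path is deliberately discarded.
    """
    return (ntpath.basename(path), size_in_bytes)


original = annotation_key(r"C:\Books\novel.txt", 1000)
moved = annotation_key(r"D:\Archive\novel.txt", 1000)
renamed = annotation_key(r"C:\Books\novel2.txt", 1000)
resized = annotation_key(r"C:\Books\novel.txt", 1001)
```

Here `moved` equals `original`, so a moved file would still match its annotations, while `renamed` and `resized` would not.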

Your reading position is saved in the database when you close a file. Note that we use the database approach for all files other than KES files - largely because we had to completely rewrite DAISY files to save the reading position, and this approach is much faster. Bookmarks and notes are saved in the database only when you save a file in a format other than KES or DAISY.

Friday, November 10, 2006

Scan and Recognize from within Microsoft Word

The requests for this feature have perplexed me for years. Kurzweil 1000 costs more than commercially available, mainstream OCR packages. What people are paying for, presumably, is the inclusion of all sorts of features that make sense mainly for the Blind, and a seamless audible interface. If you scan from within Microsoft Word, you pretty much lose all of that. What I finally came to realize is that the people who wanted this feature didn't want to use it constantly - instead, it's a convenience when you happen to already be in Word, and you happen to want to work from material that is in print. So, this time around, we added the feature.

Suppose you are in Microsoft Word, and you wish to scan a page. At the bottom of the Insert menu, you will find a menu item called "Kurzweil Scan and Recognize". Activate that menu item, and a page will be scanned and recognized. The results will flow into your Word document, beginning at your current cursor position. Kurzweil 1000 doesn't need to be open.

You can adjust certain scanning and recognition options from within Microsoft Word. To do so, activate the "Kurzweil Options..." menu item near the bottom of the Tools menu. A dialog box will come up. Its contents include a list of scanners, so that you can select the scanner source, a slider for your brightness setting, a list to select the scan type (black and white, grayscale, or color - note no dynamic thresholding here!), a list for the OCR engine, a check box to enable or disable column identification, another check box to enable or disable despeckling, and a list of possible recognition languages.

That's about it. It's important to note that there are good reasons to think that this is not the way you want to do all of your scanning. Important settings might be missing at the moment, but the main thing is that you can't read and edit the document at the same time that you are scanning it. Remember that the recognition results will come in starting at your cursor position. Fiddling with that position while recognition was in progress would cause a lot of trouble.

We have gotten this feature to work with Word 2002, Word 2003, and Word 2007. It does not work with Word 95 or Word 97.

Wednesday, November 08, 2006

An Appointment Calendar Application

We've been meaning to write an appointment calendar application for a long time. Early in the planning stage for each release, this would be fairly high on our priority list. Each time, though, I'd throw so many requirements at it that it would become too time consuming to do. For example, I wanted a way to include holidays, and I wanted holiday categories that included dates drawn from one or more lunar calendars. I wanted it to be able to synchronize with PDAs. I wanted it to provide multiple mechanisms for reminding people about appointments.

This time, I finally learned my lesson. Make it really simple, so there was at least half a chance that we would get it done in time for the release. Add items that our pilot customers thought were absolutely essential. Expect to add more items over time, in future releases. So, you'll find that this is a relatively simple, though useful application. It is not for those of you who have mastered the intricacies of Microsoft Outlook and synchronize everything to each of your several PDAs. Instead, it's a straightforward way to create appointments and to be reminded of them.

You can start the appointment calendar application in one of two ways. Like other K1000 Applications, you can use the File Launch facility if you'd like. Get into the File menu, select the Launch item, and then the Appointment Calendar (the mnemonic is "M"). It's easier, though, to use the hot key that is associated with the Appointment Calendar. By default, it is Control+Alt+A. You can use that without first running Kurzweil 1000 if you'd like.

Please note that the Appointment Calendar is a separate talking application. As a consequence, you are likely to need to configure your screen reader to stop it from speaking at the same time.

If this application is going to have to remind you of any appointments, it needs to be running. If you create appointments for it, it will be started in the background whenever you restart your computer.

Once you have run the application, you will find that it has a menu bar with three separate items: File, Tools, and Help. Use the Help About item to learn about the functionality of the application. You'll find it's pretty straightforward. The most complicated area is creating or editing an appointment. Use File->New to create one, File->Edit to modify one, File->Delete to (of course) delete one, and File->Duplicate to copy one. Copying an appointment is useful if you want to make several appointments, where most of the appointment properties are similar.

The dialog that you will use to create or modify an event begins with a combo box titled "Name". This can be a little misleading. You can, in fact, name an appointment here in any manner you wish, but in some ways this is more like a category, or template, for appointments. You'll find that some have already been created for you, which you can access with the up and down arrow keys. These "names" include Anniversary, Appointment, Birthday, Daily Event, Holiday, Meeting, Monthly Event, Phone Call, Reminder, Weekly Event, and Yearly Event. As these names might suggest, they influence the default settings for the rest of the dialog - particularly whether or not an event recurs, and how often it recurs.

The next control is a list box labeled "Recurrence". You can use it to decide if an event occurs only once, if it happens monthly or yearly, or if it happens daily or weekly. The next cluster of choices lets you specify the time of the event.

After that, we get to the date choices, which are influenced by whether or not the event recurs. If it does not recur, you simply specify the day, month, and year. If it recurs monthly or yearly, you will find that there is a check box for each month of the year (the labels sound a little odd, by the way, because we are using just the first three letters of each month). Each month can be checked or unchecked. If checked, you are indicating that the event can occur in that particular month. After those twelve check boxes, there is a list box that lets you cause the event to repeat on a particular day of the month, or on a particular day of the week. If, however, the recurrence field was set to daily or weekly, you'll have seven check boxes - one for each day of the week. After that, there will be a list box which lets you choose between repeating the event every week, every second week, every third week, or every fourth week.

Beyond that, there is a text box that lets you enter details about the particular event. These details will be read to you when you are reminded about the appointment.
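The daily or weekly recurrence choices described above (seven day-of-week check boxes plus an "every Nth week" list) can be sketched as a simple date test. This is only an illustration, not the application's actual logic; in particular, the way weeks are counted from a starting Monday is an assumption, and all of the names are hypothetical.

```python
from datetime import date


def occurs_on(day, first_week_start, weekdays, every_n_weeks):
    """Return True if a weekly event falls on `day`.

    first_week_start: the Monday of the first week in which the event occurs
    weekdays: set of checked days, using Monday=0 .. Sunday=6
    every_n_weeks: 1 for every week, 2 for every second week, and so on
    """
    if day < first_week_start or day.weekday() not in weekdays:
        return False
    weeks_elapsed = (day - first_week_start).days // 7
    return weeks_elapsed % every_n_weeks == 0


# An event on Mondays and Wednesdays, every second week,
# starting the week of Monday, November 6, 2006.
start = date(2006, 11, 6)
checked = {0, 2}
```

With these values, the event would fall on November 6 and 8, skip the following week, and return on November 20 and 22.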

Finally, there is a group of controls regarding how, or if, you would like to be reminded about the appointment. First is a check box - check it if you want a reminder. Then there is a numeric text box where you can enter a number, followed by a list box that specifies what that number is for: minutes, hours, or days before the appointment.
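The arithmetic behind those reminder controls is simple enough to sketch. The function and parameter names below are hypothetical; only the three units (minutes, hours, or days before the appointment) come from the dialog described above.

```python
from datetime import datetime, timedelta

ALLOWED_UNITS = ("minutes", "hours", "days")


def reminder_time(appointment, amount, unit):
    """Return the moment at which a reminder should appear.

    appointment: datetime of the appointment
    amount: the number entered in the numeric text box
    unit: one of "minutes", "hours", or "days"
    """
    if unit not in ALLOWED_UNITS:
        raise ValueError("unit must be minutes, hours, or days")
    return appointment - timedelta(**{unit: amount})


meeting = datetime(2006, 11, 27, 9, 0)
```

A reminder set for 30 minutes before this meeting would appear at 8:30 that morning, and one set for 2 days before would appear at 9:00 on November 25.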

That's about it, really. If you have an appointment, you want to keep this application running in the background. Use the escape key to dismiss it, and return to your main application. If you would rather cause it to exit altogether, use File->Exit.

When you need a reminder, the appointment calendar will bring up a dialog box, and the contents of that box, which contain your comments regarding the appointment, will be spoken. You can also specify a wave file that will be played when reminders occur.

Monday, November 06, 2006

Updated OCR Engines

We currently ship with Optical Character Recognition technology developed by two different companies: Abbyy, which makes FineReader, and Nuance, which makes OmniPage. In general, the commercial versions of OCR products from those companies are released first. Later, sometimes much later, they package those technologies as toolkits that other companies, such as ours, can license and use. Both of those companies provided a new version of OCR technology in time for Version 11 of Kurzweil 1000 - FineReader Version 8, and ScanSoft Version 15.

We've taken advantage of some new features in both products, giving us some new functionality for analyzing forms (see an earlier post) and for opening PDF files. Mostly, though, we were interested in recognition improvements. We do see some, though we haven't done a large scale analysis yet of the differences between earlier versions and this one in terms of recognition accuracy.

One important point - the latest version of ScanSoft OCR no longer supports Windows '95, Windows '98, or Windows ME. If you use any of those operating systems, we will install an older version of ScanSoft OCR - version 12.6 - rather than the current one. As a consequence, form recognition will not be available.

Both engines now provide some speed control - that is, you can ask the engine to recognize something quickly, perhaps at the expense of the accuracy of the recognition. Some of you may remember that we listed ScanSoft's recognition engine twice in our previous release - one listing for speed, the other for accuracy. Now you will find it listed only once, because a separate setting applies to both recognition engines. That setting is labeled "Recognition Approach" - the setting options are "Accuracy" or "Speed". You'll find it immediately before the Engine setting in the Recognition Settings dialog. You'll also find that we removed two settings: Recognition of Light Text on a Dark Background and Questionable Character Markup. The former is not controllable in some engines, and is now always enabled. The latter wasn't possible with one of the new engines, and seemed to be used rarely, if at all.

While I'm on the subject of recognition settings, let me mention an important one. I'll talk about it more in a post on conversion settings, but it really has to do with character recognition. In the Conversion Settings Dialog, you'll find one setting that has to do with opening PDF documents. The setting is labeled "Emphasis", and your choices are "Recognition of Images" and "Extraction of Text". PDF files are unusual in that they contain both text and images. Sometimes they have no text, but they do have images of text - this happens most often when the person who created the PDF file used a scanner to make it. Sometimes they have text for portions of a page, but not for all of it. Sometimes the text for the full page is available, but it is not clear how it should be ordered. Both OCR engines can extract the text, if it is there, and use it. If you indicate that Extraction of Text should be emphasized, they will use the images only to establish the position of the text and its reading order. If you indicate that Recognition of Images should be emphasized, the text will be used only on a word by word basis to correct simple recognition errors. Recognition of Images is the default, although, if your PDF file has all of its text, it will be slower and less accurate than the other option. Unfortunately, if your PDF file contains images of text for which there is no text that can be extracted, the other option can cause entire sections of a page to be skipped.

Although we haven't independently verified the vendors' claims, I thought you might be interested in what they say about improvements in their new OCR engines.

For Abbyy, see http://www.abbyy.com/sdk/?param=35469
For Nuance, see http://www.nuance.com/omnipage/capturesdk/whatsnew.asp

I have reproduced some of the more pertinent claims here.

Abbyy has added a "Fast Mode", performing recognition up to two times faster.

Intelligent image analysis in FineReader Engine 8.0 delivers higher recognition accuracy. FineReader technology automatically adjusts its algorithms to account for image condition, resulting in increased accuracy by up to 30% on low resolution documents (scanned at under 200 dpi or faxes).

Abbyy also claims that their PDF processing is up to two times faster, and often more accurate, as they do a better job of analyzing the internal information within source PDF files, including annotations, metadata, text objects, font dictionaries, and content streams.

Nuance suggests that its newly developed 3-way voting system provides a 36% increase in accuracy over previous versions.

Your mileage, as they say, may vary. We'd be interested in what you think once you've taken version 11 out for a spin.