This version corrects a bug where captioned items were skipped in a slideshow of only captioned items.
Blog
What can Automatic Classification Bring to Records Management?
Records management is an unglamorous but essential part of the operation of any organization. The computerization of the workplace brought about some paradoxical changes (see Computers and the Death of Recordkeeping) and the rise to prominence of Artificial Intelligence has resulted in renewed interest in automatic classification replacing the historical role of the filing clerk.
Google and Organizational Search
The spectacular success of Web search engines such as Google in retrieving desired information naturally makes people wish they could replicate that success within the organizations they work for. Google realized this too, and the history of the Google Search Appliance (which reaches its official end of life in 2019, after 17 years) illustrates its ultimately unsuccessful attempt to provide this. The basic reason for the failure is that there are seldom any hyperlinks to use for ranking search results within organizations. Ranking tends to be by keyword frequency, which gives poor results. No other search vendor has done any better, but a number of specialist Records Management companies are tackling this problem with automatic classification – assigning documents to file plan categories using automated methods. As paper records management gives way to digital, the problems of effective management of organizational records become more obvious and vendors are jumping at the opportunity.
Born Digital
One problem at least is becoming less serious – as born-digital documents extend further into the past, the task of extracting machine-readable text from scanned document images is diminishing, and technical improvements in optical character recognition mean that extracted text from scanned documents is more likely to represent what was written than in the past. However, the task of grouping and classifying documents in the manner in which filing clerks used to operate has not seen a similar improvement.
Automatic Classification
There are two main methods of automatic classification: rule-based and training set based. Rule-based classification has a long history (back to the 1970s) and is simple to apply and understand. The occurrence of a word or phrase in a document, or of words or phrases in proximity to each other, is taken to indicate membership of a particular class. In the simplest case, a single occurrence of a word or phrase is sufficient to place a document into a particular class. This approach is used by products intended for tagging rather than records management. Variants of this approach use statistical rather than binary classification, and may perform deterministic processing of text (such as word stemming) to improve performance. The task of defining a set of rules may be purely manual, or may use a training set of already classified documents to identify words and phrases occurring more commonly in a particular class and so generate rules semi-automatically.
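To make the simplest case concrete, here is a minimal Python sketch of a binary rule-based classifier. The category names and trigger phrases are invented for illustration and are not taken from any real file plan or product; a practical system would add proximity rules, stemming and statistical weighting.

```python
import re

# Hypothetical file plan rules: category -> trigger words or phrases.
RULES = {
    "Finance": ["invoice", "purchase order"],
    "Human Resources": ["leave request", "payroll"],
}


def classify(text, rules=RULES):
    """Return every category with at least one whole-word rule match in the text."""
    lowered = text.lower()
    matches = []
    for category, phrases in rules.items():
        for phrase in phrases:
            # \b anchors stop "invoice" from matching inside a longer word
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                matches.append(category)
                break  # one occurrence is enough to assign the category
    return matches
```

Note that, without stemming, a document containing only "invoices" would not match the "invoice" rule, which is the kind of gap the deterministic text processing mentioned above is meant to close.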
Why AI Can’t Help Much Within Organizations
Modern Artificial Intelligence classification may use neural nets to perform classification, treating words and phrases simply as tokens in the same way as features extracted from images. Such methods require large training sets, and the rules generated are unknowable and cannot be easily edited. The token-based analysis means that if words in text content are randomly re-ordered, turning the document into gibberish, the classification result remains unchanged, except for effects caused by the disruption of phrases. As more documents are classified, and the classifications approved, the number and quality of rules may increase, allowing vendors to claim that their systems learn from experience. However, the lack of transparency of the rules used by networks means that content of a document unrelated to its meaning (such as an organization’s address) may end up forming the basis of a classification rule.
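The word-order point can be illustrated with the simplest token model, a bag of single words. This is only a sketch under that assumption: the sample sentence is invented, and real classifiers that also count phrases would be affected by shuffling to the extent that phrases are broken up.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Reduce text to token counts, discarding word order."""
    return Counter(text.lower().split())


original = "the database upgrade failed after the backup completed"
words = original.split()
random.Random(0).shuffle(words)  # fixed seed so the run is repeatable
shuffled = " ".join(words)

# The shuffled text is gibberish to a reader, but the features are identical,
# so any classifier fed only these counts must give the same answer.
same_features = bag_of_words(original) == bag_of_words(shuffled)
```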
Non-explicit content
Another problem with content-based classification is that the most important classification features of documents created for a small audience are not explicitly mentioned, as anyone viewing the document was assumed by its creator to know what it was about. A specific example from one company was a document relating to problems encountered with an Oracle database upgrade. The content never mentioned the word “Oracle” and only referred once to the version number of the database. The document name and provenance are often a much better guide to classification than the text content.
If Google Translate Is So Good, Why Can’t It Classify My Documents?
The increasing power of computers, and particularly the availability of massive cloud resources, masks the fact that automatic classification (and machine translation between languages) is based on statistical analysis of words and phrases, not on their meaning, as extracted by any human reader. The nuances of language are such that even the detection of negation in a text is a PhD topic, like deciding whether the sentence “Time flies like an arrow” is a statement about a type of fly or about human experience. The spectacular success of Artificial Intelligence in playing rule-based board games such as chess and Go does not involve any automated understanding of the complexities of language use, and its success in quiz games is based more on good database design than anything else. Reputedly, when IBM’s Watson was asked “Who was the first woman in space?” its answer was “Wonder Woman”, as it had no concept of the distinction between fiction and non-fiction.
Notwithstanding the limitations of statistical classification, any type of classification is usually better than none, even with a high error rate, but expectations should be tempered. The problem of language understanding may well be cracked in the future and automated filing clerks will become available, but we are certainly not at this stage yet.
Why it’s Difficult to Caption Videos

Text superimposed on video content is everywhere: YouTube even offers to caption videos by transcribing the audio to text and writing it synchronously over the video. Most video sources (including DVDs) have an option to show subtitles in any one of a number of languages. You can find the distinction between subtitles and closed captioning here. Closed captions are optional – you can view them or not, whereas subtitles are part of the video.
Videos Have More Information Than Images
If text over video is so common, why is it difficult to do? The short answer is that videos contain much more information than images. Placing text in a video requires re-creation of all the content. This requires much more computation than re-creating a single image. However, showing a caption only requires storing the text, the time it is to be shown and the location on the screen. This information can be easily encoded in a file and rendered by the video player software. This is how most text on video is displayed. One problem is that different video players use different file formats for the data. Another is that the file is separate from the video file. If you download a video that has been auto-captioned from YouTube using a third-party application, any auto-captions will not be included.
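As an illustration of how little data a separate caption file needs, the following Python sketch writes captions in the SubRip (.srt) layout, one of the widely used caption file formats. The caption text and timings are invented, and screen position is left out because the basic SRT layout carries only a sequence number, a time range and the text.

```python
def srt_timestamp(seconds):
    """Format seconds as the HH:MM:SS,mmm form used in SubRip (.srt) files."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def build_srt(cues):
    """Build SRT text from (start_seconds, end_seconds, caption) tuples."""
    blocks = []
    for index, (start, end, caption) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{caption}\n"
        )
    # Cues are separated from each other by a blank line
    return "\n".join(blocks)


# Hypothetical captions for illustration
srt_text = build_srt([
    (0.0, 2.5, "Arriving at the beach"),
    (2.5, 6.0, "Building the sandcastle"),
])
```

Saving `srt_text` alongside the video gives a player that understands SRT everything it needs, which is also why the captions are lost if only the video file is copied.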
Web-Based Facilities
Web-based facilities for displaying remotely stored video files, such as YouTube, can ensure that all videos are displayed using a video player that supports the display of separately stored captions. However, the separation of the captions from the video file means that the caption data is not easily available, or not available at all. YouTube has a takeout facility. This allows users to download all their YouTube video content, including a JSON metadata file for each video file. This file includes many metadata fields for videos but does not include the caption data.
Using a separate file for text and video is great for flexibility, but does require that the file be kept along with the video content. A further problem is that not all video players support all the available caption file formats. Perhaps some future video format will allow incorporation of text captions as metadata of the main video file. However, future video players must be able to read it! The default Windows 10 video player, Photos, supports a number of caption file formats and there are many online facilities for generating them, some of which are reviewed here.
Making Captions Readable on Any Player
So if you want to ensure that you can caption videos so that the captions are readable on any video player and are embedded in the video data, what are the options? If you want to keep the entire original video frame and place the caption beneath it, then the video needs to be padded out with a uniform colour bar below the video frame. This can be done using the command-line application ffmpeg, which is available for Windows. A complexity of this operation is that portrait mode videos from smartphones may be padded with black at the sides to make the video frame the same dimensions as in landscape mode.
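One way the padding step might be scripted is sketched below in Python, using ffmpeg's `pad` video filter. The file names, the 80-pixel bar height and the black colour are assumptions chosen for illustration, and this is not necessarily how any particular captioning product does it. The sketch only builds the command; running it requires ffmpeg to be installed and the input file to exist.

```python
import subprocess


def pad_command(source, target, bar_height=80, colour="black"):
    """Build an ffmpeg command that adds a uniform bar below the video frame."""
    # pad=width:height:x:y:color keeps the original frame at the top-left (0,0)
    # and extends the canvas downwards by bar_height pixels.
    pad_filter = f"pad=iw:ih+{bar_height}:0:0:{colour}"
    # The audio stream is copied unchanged; the video has to be re-encoded.
    return ["ffmpeg", "-i", source, "-vf", pad_filter, "-c:a", "copy", target]


# Hypothetical file names
command = pad_command("holiday.mp4", "holiday_padded.mp4")
# To actually run it (needs ffmpeg on the PATH and an existing input file):
# subprocess.run(command, check=True)
```

An even bar height is a sensible choice, since commonly used encoders expect even frame dimensions. Videos that already carry black side padding would need cropping first, which this sketch does not attempt.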
Copying from Analogue Media
Videos copied from analogue media such as Hi8 or VHS cassette tapes may have similar padding added. The caption can then be written on the uniform colour bar or on top of the video, either using a web-based service such as Kapwing Subtitler or a desktop video editor such as Photos for Windows 10. Photos does not offer the flexibility of caption font, colour, and position selection offered by Kapwing Subtitler, but it is simple to use and available as part of Windows 10. Using a desktop application is likely to be much slower than using a cloud-based facility, because cloud services can apply more computing resources than are available on the average domestic computer to the task of reading video frames, adding text and re-encoding the video.
Caption Pro 2.2.86 Released
This release of Caption Pro extends captioning capabilities to include videos as well as images. It also lets you present a sequence of images and videos on a Windows desktop in the same way that you do on a mobile phone, with captions below each image and video to keep all of your precious pixels intact.
Caption Pro Web Updated
Caption Pro Web has had a major makeover after reviewing how users have been interacting with it. The interface now supports loading files by dragging or pasting and is now a lot simpler (I hope). Captioned images can be downloaded by right-clicking on the image as well as using the Download button. And it’s free to use – no registration needed.
Caption Pro 2.1.85 Released
This new version of Caption Pro lets you load files by drag and drop, and paste individual images into the program. It also lets you sort loaded files by file name, Modified date and Date Taken (usually but not always the same) and re-order collections of images manually so that you can group your photos in any way and integrate photos from different digital devices.
Notes:
1) If you check Start Program on the last screen of the installer, Caption Pro will be run from an Administrator account and you will not be able to drag and drop files onto it. If you run it after installation from the desktop icon or the Aleka Consulting->Caption Pro program entry, drag and drop will work OK.
2) Caption Pro 2.1.85 does not run correctly on Windows 7. This problem is under investigation.
Aleka Websites Now Secure
All Aleka websites (including captionpro.com.au) now use https protocol to preserve user security.
New Releases - SetTags 3.1.84 and OutlookTag 1.0.0.19
SetTags v 3.1.84 includes the ability to add tags to Outlook Contacts via the new version of Outlook Tag Add-In (1.0.0.19). Other improvements have been made to Workgroup operation.
Tagging in Minimal-Infrastructure Environments
Once upon a time, organizations needed a physical presence that people could visit in order to engage with them. With the rise of virtual interactions, the need for a physical presence has diminished, and web sites, which are far cheaper than offices, have become the means of engagement with clients. This change has benefited small organizations greatly but has made the task of collaboration between organization staff more difficult. It’s not possible to walk over to someone’s office or cubicle to discuss something, or to look through a ring-binder for an important document. Tagging documents and emails can help, but with virtual infrastructure it can be complicated.
New version of Caption Pro (2.1.82)
A new version of Caption Pro is now available. It fixes the problem of the screen getting smaller after applying a caption, and adds support for setting aspect ratios in non-English Windows versions.