Blog - Page 10 of 12 - Aleka Consulting

Caption Pro 2.2.89

Caption Pro 2.2.89 now has a facility to change the colour of the caption font and background for multiple files.

Caption Pro 2.2.88 Released

This release fixes a bug where the program did not restart automatically after license key entry, and adds a facility for embedding captions in a video using the Kapwing web application.

A software manager, a hardware manager, and a marketing manager are driving to a meeting when a tyre blows. They get out of the car and look at the problem. The software manager says, “I can’t do anything about this – it’s a hardware problem.” The hardware manager says, “Maybe if we turned the car off and on again, it would fix itself.” The marketing manager says, “Hey, 75% of it is working – let’s ship it!”

The great thing about standards is that there are so many to choose from.

Standards are like toothbrushes – everyone has one but nobody wants to use anyone else’s.

Why Nicholas Carr is Wrong About the Cloud Computing Revolution

Nicholas Carr is a great writer (and blogger). His books thoroughly deserve to be best-sellers. His 2008 volume “The Big Switch”, subtitled “The Definitive Guide to the Cloud Computing Revolution”, is prefaced by pages of enthusiastic reviews. Carr’s thesis is that cloud computing will become a utility in the same way as electricity – centrally provided, cheap and transformative for society. He draws on the history of the electricity industry in the US in the late 19th century, predicting that the appearance of the cloud-based “World Wide Computer” will make using computer applications as simple as plugging an appliance into a power socket. In a 2013 Afterword he observes a change in the attitude of the business world to cloud computing from “sneering skepticism” to “bubbly enthusiasm”, and quotes a McKinsey report estimating the cost of buying a new server to be three times the price of obtaining the same computing capacity remotely.

Vendors

He also notes the foundering of once-buoyant hardware businesses of major vendors, and equates the proliferation of mobile devices such as phones and tablets with electrical appliances which proliferated after the building of the power grid, observing that they draw most of their value from online data stores and services, and lose much of their utility if disconnected. Many mobile devices currently sold have no facility for connecting to local computing or storage devices: they only communicate wirelessly with the cloud. One less socket is also one less thing to go wrong.

So how could he be wrong?

Carr blurs the distinction between cloud storage and running cloud applications. Cloud data storage is relatively straightforward to implement and the massive data storage volumes available at data centres provide enormous convenience, notably for storing the image and video data collected in vast quantities by mobile phones. High-bandwidth connections also facilitate social media connection between mobile devices. However, the applications using the cloud run on local hardware – including the Web browsers – and installation of apps on mobile devices has been streamlined to a one-click operation on selections made from an App store. The fact that an application is installed via an Internet download does not make it a cloud application: a cloud application can be run from any Web browser, rather than just being downloaded.

Web applications

Web applications, which can be run from a Web browser, certainly exist and have become much more sophisticated, driving development of the HTML standard to include many more features. However, this interface complexity means that a massive effort is required to build a Web application that will run successfully on the multiplicity of Web browsers currently available. This difficulty means that only a few well-resourced applications will be able to be run in this way. Microsoft have addressed this problem in Office 365 by providing a desktop application, priced on a subscription rather than purchase basis, which runs using local copies of files that are synchronized with a cloud repository. The Web application part of Office 365 is much less capable. The experience of Web applications only running with a specific Web browser (or even a specific version of a Web browser) is distressingly common, and features such as ad and pop-up blockers, and cookie or JavaScript disabling, can also affect Web application operation.

Will Web Applications Be Like Appliances?

As businesses and individuals typically use dozens of applications, it seems unlikely that all of these will be available as Web applications, even if they use data stored in the Cloud or services available there. This means that cloud applications are not likely to be nearly as pervasive as electrically powered appliances which only need to be plugged into a power socket. The prevalence of DC-powered devices means that differences in the voltage and frequency of AC electricity supply (varying most dramatically between the 110 volt supply of the USA and the 240 volt, 50 Hz supply of Europe) are easily accommodated by most modern power supplies, allowing the same device and power supply to be used in most countries. The computing analog to this situation, where any software can be run efficiently on any device, seems very unlikely to come to pass. Emulators give some capability for running software on non-native operating systems, but they usually require considerable computer expertise to install, and may not offer complete functionality or expected levels of performance. It seems as though the transformation of human activity brought about by the provision of electricity as a utility won’t be brought about by cloud computing.

The Future

However, this doesn’t mean that access to data and services via the Internet hasn’t dramatically changed things and won’t continue to do so. Access to remote server web services means that colossal computing power can be brought to bear on any problem very easily – the best example being Web search. It’s estimated that Google has about 1 million cores in its server farms around the world, many of which are involved in Web searches from any device anywhere in the world. Facial recognition is a problem that was deemed computationally infeasible 20 years ago. Now, it’s a commodity. The power of multi-level neural networks is being applied to Artificial Intelligence problems in many domains in a similar way, using massive remote resources. So in the future, you can expect to do things at home, at little or no cost, that are currently only possible within well-equipped research labs. Like electricity, you’ll only notice Internet connectivity when it’s not there, but don’t expect a revolution.

Caption Pro 2.2.87 released

This version corrects a bug where captioned items were skipped in a slideshow of only captioned items.

What can Automatic Classification Bring to Records Management?

Records management is an unglamorous but essential part of the operation of any organization. The computerization of the workplace brought about some paradoxical changes (see Computers and the Death of Recordkeeping) and the rise to prominence of Artificial Intelligence has resulted in renewed interest in automatic classification replacing the historical role of the filing clerk.

Google and Organizational Search

The spectacular success of Web search engines such as Google in retrieving desired information naturally makes people wish they could replicate its success within the organizations that they work for. Google realized this too and the history of the Google Search Appliance (which reaches its official end of life in 2019, after 17 years) illustrates their ultimately unsuccessful attempt to provide this. The basic reason for this failure is that there are seldom any hyperlinks to use for ranking search results within organizations. Ranking tends to be by keyword frequency, which gives poor results. No other search vendor has done any better, but a number of specialist Records Management companies are tackling this problem with automatic classification – assigning documents to file plan categories using automated methods. As paper records management gives way to digital, the problems of effective management of organizational records become more obvious and vendors are jumping into the opportunity.

Born Digital

One problem at least is becoming less serious – as born-digital documents extend further into the past, the task of extracting machine-readable text from scanned document images is diminishing, and technical improvements in optical character recognition mean that extracted text from scanned documents is more likely to represent what was written than in the past. However, the task of grouping and classifying documents in the manner in which filing clerks used to operate has not seen a similar improvement.

Automatic Classification

There are two main methods of automatic classification: rule-based and training set based. Rule-based classification has a long history (back to the 1970s) and is simple to apply and understand. The occurrence of a word or phrase in a document, or of words or phrases in proximity to each other, is sufficient to indicate membership of a particular class. In the simplest case, a single occurrence of a word or phrase is sufficient to place a document into a particular class. This approach is used by products intended for tagging rather than records management. Variants of this approach use statistical rather than binary classification, and may perform deterministic processing of text (such as word stemming) to improve performance. The task of defining a set of rules may be purely manual, or may use a training set of already classified documents to identify words and phrases occurring more commonly in a particular class and so generate rules semi-automatically.
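The simplest case described above, where a single occurrence of a word or phrase places a document in a class, can be sketched in a few lines of Python. This is only an illustration: the class names and trigger phrases in `RULES` are hypothetical and are not taken from any real file plan or product.

```python
import re

# Hypothetical rules: a class is assigned if any one of its phrases occurs.
RULES = {
    "invoices": ["invoice", "purchase order"],
    "contracts": ["agreement", "terms and conditions"],
}


def classify(text, rules=RULES):
    """Return the sorted list of classes whose rules match the text."""
    lowered = text.lower()
    matched = []
    for label, phrases in rules.items():
        for phrase in phrases:
            # Match whole words only, so "invoice" does not fire on "invoiced".
            if re.search(r"\b" + re.escape(phrase) + r"\b", lowered):
                matched.append(label)
                break  # one occurrence is enough for this class
    return sorted(matched)


if __name__ == "__main__":
    print(classify("Please find the attached invoice."))
```

A document can match several classes or none at all, which is one reason this style of rule suits tagging better than filing a record in a single category.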

Why AI can’t help much within organisations

Modern Artificial Intelligence classification may use neural nets to perform classification, treating words and phrases simply as tokens in the same way as features extracted from images. Such methods require large training sets and the rules generated are unknowable, and cannot be easily edited. The token-based analysis means that if words in text content are randomly re-ordered, turning the document into gibberish, the classification result remains unchanged, except for effects caused by the disruption of phrases. As more documents are classified, and the classifications approved, the number and quality of rules may increase, allowing vendors to claim that their systems learn from experience. However, the lack of transparency of the rules used by networks means that content of a document unrelated to its meaning (such as an organization’s address) may end up forming the basis of a classification rule.
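The point about re-ordered words can be illustrated with a plain token count. The sketch below is not a neural network and does not represent any vendor's system; it assumes the simplest possible feature extraction (single-word counts, no phrases) and shows that scrambling a sentence leaves those features untouched. The sample sentence is made up.

```python
import random
from collections import Counter


def bag_of_words(text):
    """Count each lowercase token, discarding word order entirely."""
    return Counter(text.lower().split())


original = "the database upgrade failed during the overnight backup"

# Shuffle the words with a fixed seed so the example is repeatable.
words = original.split()
random.Random(0).shuffle(words)
scrambled = " ".join(words)

# The token counts are identical, so any classifier that sees only these
# counts must give both versions the same result.
same_features = bag_of_words(original) == bag_of_words(scrambled)

if __name__ == "__main__":
    print(scrambled)
    print(same_features)
```

A system that also counts phrases would see some difference between the two versions, which is the exception noted above.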

Non-explicit content

Another problem with content-based classification is that the most important classification features of documents created for a small audience are not explicitly mentioned, as anyone viewing the document was assumed by its creator to know what it was about. A specific example from one company was a document relating to problems encountered with an Oracle database upgrade. The content never mentioned the word “Oracle” and only referred once to the version number of the database. The document name and provenance often are a much better guide to classification than the text content.

If Google Translate Is So Good, Why Can’t It Classify My Documents?

The increasing power of computers, and particularly the availability of massive cloud resources, masks the fact that automatic classification (and machine translation between languages) is based on statistical analysis of words and phrases, not on their meaning, as extracted by any human reader. The nuances of language are such that even the detection of negation in a text is a PhD topic, like deciding whether the sentence “Time flies like an arrow” is a statement about a type of fly or about human experience. The spectacular success of Artificial Intelligence in playing rule-based board games such as chess and Go does not involve any automated understanding of the complexities of language use, and its success in quiz games is based more on good database design than anything else. Reputedly, when IBM’s Watson was asked “Who was the first woman in space?” its answer was “Wonder Woman”, as it had no concept of the distinction between fiction and non-fiction.

Notwithstanding the limitations of statistical classification, any type of classification is usually better than none, even with a high error rate, but expectations should be tempered. The problem of language understanding may well be cracked in the future and automated filing clerks will become available, but we are certainly not at this stage yet.

Why it’s Difficult to Caption Videos

Text superimposed on video content is everywhere: YouTube even offers to caption videos by transcribing the audio to text and writing it synchronously over the video. Most video sources (including DVDs) have an option to show subtitles in any one of a number of languages. You can find the distinction between subtitles and closed captioning here. Closed captions are optional – you can view them or not, whereas subtitles are part of the video.

Videos Have More Information Than Images

If text over video is so common, why is it difficult to do? The short answer is that videos contain much more information than images. Placing text in a video requires re-creation of all the content. This requires much more computation than re-creating a single image. However, showing a caption only requires storing the text, the time it is to be shown and the location on the screen. This information can be easily encoded in a file and rendered by the video player software. This is how most text on video is displayed. One problem is that different video players use different file formats for the data. Another is that the file is separate from the video file. If you download a video that has been auto-captioned from YouTube using a third-party application, any auto-captions will not be included.
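To show how little data a caption file needs, here is a minimal Python sketch that writes caption text, timing and vertical position in WebVTT, one of the caption file formats in common use. WebVTT is chosen here only as an example and is not named in the discussion above; the helper names and the default position of 90% down the frame are assumptions made for illustration.

```python
def format_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp, HH:MM:SS.mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


def make_cue(start, end, text, line_percent=90):
    """Build one cue: time range, vertical position, then the caption text."""
    timing = f"{format_timestamp(start)} --> {format_timestamp(end)}"
    return f"{timing} line:{line_percent}%\n{text}"


def make_webvtt(cues):
    """Join (start, end, text) tuples into a complete WebVTT document."""
    return "WEBVTT\n\n" + "\n\n".join(make_cue(*cue) for cue in cues) + "\n"


if __name__ == "__main__":
    print(make_webvtt([(1, 4, "Hello"), (5.5, 8, "Second caption")]))
```

The whole caption track is a few lines of text, whereas writing the same words into the frames means re-encoding every frame of the video.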

Web-Based Facilities

Web-based facilities for displaying remotely stored video files, such as YouTube, can ensure that all videos are displayed using a video player that supports the display of separately stored captions. However, the separation of the captions from the video file means that the caption data is not easily available, or not available at all. YouTube has a takeout facility. This allows users to download all their YouTube video content, including a JSON metadata file for each video file. This file includes many metadata fields for videos but does not include the caption data.

Using a separate file for text and video is great for flexibility, but does require that the file be kept along with the video content. A further problem is that not all video players support all the available caption file formats. Perhaps some future video format will allow incorporation of text captions as metadata of the main video file. However, future video players must be able to read it! The default Windows 10 video player, Photos, supports a number of caption file formats and there are many online facilities for generating them, some of which are reviewed here.

Making Captions Readable on Any Player

So if you want to ensure that you can caption videos so that the captions are readable on any video player and are embedded in the video data, what are the options? If you want to keep the entire original video frame and place the caption beneath it, then the video needs to be padded out with a uniform colour bar below the video frame. This can be done using the Windows command-line application ffmpeg. A complexity of this operation is that portrait mode videos from smartphones may be padded with black at the sides to make the video frame the same dimensions as in landscape mode.
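As a rough sketch of the padding step, the Python function below builds an ffmpeg command line that uses ffmpeg's `pad` video filter to add a colour bar below the frame. The function name, file names, bar height and colour are all hypothetical, and this is not necessarily the exact command used by any particular product. It does not handle the portrait mode side padding mentioned above.

```python
def pad_command(src, dst, bar_height=80, colour="black"):
    """Build an ffmpeg argument list that adds a colour bar below the frame."""
    # pad=width:height:x:y:color keeps the input width (iw), grows the height
    # (ih) by the bar, and places the original frame at the top-left corner,
    # so the extra rows appear underneath it.
    pad_filter = f"pad=iw:ih+{bar_height}:0:0:{colour}"
    # The video is re-encoded by the filter; the audio stream is copied as-is.
    return ["ffmpeg", "-i", src, "-vf", pad_filter, "-c:a", "copy", dst]


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch works
    # without ffmpeg installed.
    print(" ".join(pad_command("input.mp4", "padded.mp4")))
```

To run the command for real, pass the list to `subprocess.run(..., check=True)` with ffmpeg on the PATH. An even bar height is the safer choice, since some common encoder settings require even frame dimensions.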

Copying from Analogue Media

Videos copied from analogue media such as Hi8 or VHS cassette tapes may have similar padding added. The caption can then be written on the uniform colour bar or on top of the video, either using a web-based service such as Kapwing Subtitler or a desktop video editor such as Photos for Windows 10. Photos does not offer the flexibility of caption font, colour, and position selection offered by Kapwing Subtitler, but it is simple to use and available as part of Windows 10. Using a desktop application is likely to be much slower than using a cloud-based facility, which can apply more computing resources than are available on the average domestic computer to the task of reading video frames, adding text and re-encoding the video.

Caption Pro 2.2.86 Released

This release of Caption Pro extends captioning capabilities to include videos as well as images, and gives you the capability to present a sequence of images and videos on a Windows desktop in the same way that you do on a mobile phone, with captions below each image and video to keep all of your precious pixels intact.

Caption Pro Web Updated

Caption Pro Web has had a major makeover after reviewing how users have been interacting with it. The interface now supports loading by dragging files or pasting and is now a lot simpler (I hope). Captioned images can be downloaded by right-clicking on the image as well as using the Download button. And it’s free to use – no registration needed.

Caption Pro 2.1.85 Released

This new version of Caption Pro lets you load files by drag and drop, and paste individual images into the program. It also lets you sort loaded files by file name, Modified date and Date Taken (usually but not always the same) and re-order collections of images manually so that you can group your photos in any way and integrate photos from different digital devices.

Notes:
1) If you check Start Program on the last screen of the installer, Caption Pro will be run from an Administrator account and you will not be able to drag and drop files onto it. If you run it after installation from the desktop icon or the Aleka Consulting->Caption Pro program entry, drag and drop will work OK.

2) Caption Pro 2.1.85 does not run correctly on Windows 7. This problem is under investigation.