Thanks to all the people who have purchased licenses since the end of January – bushfire relief organisations have received over A$160 from these purchases.
Caption Pro 2.2.91 is now available. This release fixes a problem where the applied image caption was slightly smaller than the region selected for the caption and changes the licensing to apply a 2 CPU per license limit.
The recent fires in Australia damaged relatively few properties but made a lot of people think about what they wanted to take with them if they had to evacuate.
Family photos are usually included in the material to be moved out, often as a box of paper prints. Digital storage of scans of these precious items is attractive, as images can be stored in the cloud by free services such as Google Photos or in paid cloud storage. Images can be accessed from any device anywhere in the Internet-connected world, and most importantly, are unaffected by local fires.
However, whilst it is straightforward to create a digital scan (or even a photograph) of family photos or a scrapbook, paper prints often have important information, such as the date and the names of people in the photo, written on the back.
If you want to integrate the information on the back of a print with the image on the front, Caption Pro provides a very simple way of doing this, ensuring that all of your family heritage can benefit from the advantages of digital storage.
Caption Pro v 2.2.90 is now available. This version allows auto-captioning from titles applied by Picasa and corrects a bug where Picasa tags appeared in the caption rather than the sub-caption.
Caption Pro 2.2.89 now has a facility to change the colour of the caption font and background for multiple files.
This release fixes a bug where the program did not restart automatically after license key entry, and adds a facility for embedding captions in a video using the Kapwing web application.
A software manager, a hardware manager, and a marketing manager are driving to a meeting when a tyre blows. They get out of the car and look at the problem. The software manager says, “I can’t do anything about this – it’s a hardware problem.” The hardware manager says, “Maybe if we turned the car off and on again, it would fix itself.” The marketing manager says, “Hey, 75% of it is working – let’s ship it!”
The great thing about standards is that there are so many to choose from.
Standards are like toothbrushes – everyone has one but nobody wants to use anyone else’s.
Nicholas Carr is a great writer (and blogger). His books thoroughly deserve to be best-sellers. His 2008 volume “The Big Switch”, subtitled “The Definitive Guide to the Cloud Computing Revolution” is prefaced by pages of enthusiastic reviews. Carr’s thesis is that cloud computing will become a utility in the same way as electricity – centrally provided, cheap and transformative for society. He draws on the history of the electricity industry in the US in the late 19th century, predicting the appearance of the cloud-based “World Wide Computer” will make using computer applications as simple as plugging an appliance into a power socket. In a 2013 Afterword he observes a change in the attitude of the business world to cloud computing from “sneering skepticism” to “bubbly enthusiasm”, and quotes a McKinsey report estimating the cost of buying a new server to be three times the price of obtaining the same computing capacity remotely.
He also notes the foundering of once-buoyant hardware businesses of major vendors, and equates the proliferation of mobile devices such as phones and tablets with electrical appliances which proliferated after the building of the power grid, observing that they draw most of their value from online data stores and services, and lose much of their utility if disconnected. Many mobile devices currently sold have no facility for connecting to local computing or storage devices: they only communicate wirelessly with the cloud. One less socket is also one less thing to go wrong.
So how could he be wrong?
Carr blurs the distinction between cloud storage and running cloud applications. Cloud data storage is relatively straightforward to implement, and the massive storage volumes available at data centres provide enormous convenience, notably for storing the image and video data collected in vast quantities by mobile phones. High-bandwidth connections also facilitate social media connection between mobile devices. However, the applications using the cloud run on local hardware – including the Web browsers themselves – and installation of apps on mobile devices has been streamlined to a one-click operation on selections made from an app store. The fact that an application is installed via an Internet download does not make it a cloud application: a true cloud application can be run from any Web browser, rather than being downloaded and installed locally.
Will Web applications be like appliances?
As businesses and individuals typically use dozens of applications, it seems unlikely that all of these will be available as Web applications, even if they use data stored in the cloud or services available there. This means that cloud applications are not likely to be nearly as pervasive as electrically powered appliances, which only need to be plugged into a power socket. The prevalence of DC-powered devices means that differences in AC supply voltage and frequency (varying most dramatically between 110 V/60 Hz in the USA and 240 V/50 Hz in Europe) are easily accommodated by most modern power supplies, allowing the same device and power supply to be used in most countries. The computing analogue of this situation, where any software can run efficiently on any device, seems very unlikely to come to pass. Emulators give some capability for running software on non-native operating systems, but they usually require considerable computer expertise to install, and may not offer complete functionality or expected levels of performance. It seems the transformation of human activity brought about by the provision of electricity as a utility won't be repeated by cloud computing.
However, this doesn’t mean that access to data and services via the Internet hasn’t dramatically changed things, or won’t continue to do so. Access to remote Web services means that colossal computing power can be brought to bear on a problem very easily – the best example being Web search. It’s estimated that Google has about 1 million cores in its server farms around the world, many of which are involved in serving Web searches from devices anywhere in the world. Facial recognition was deemed computationally infeasible 20 years ago; now it’s a commodity. The power of multi-level neural networks is being applied to Artificial Intelligence problems in many domains in a similar way, using massive remote resources. So in the future, you can expect to do things at home, at little or no cost, that are currently only possible within well-equipped research labs. Like electricity, you’ll only notice Internet connectivity when it’s not there – but don’t expect a revolution.
This version corrects a bug where captioned items were skipped in a slideshow of only captioned items.
Records management is an unglamorous but essential part of the operation of any organization. The computerization of the workplace brought about some paradoxical changes (see Computers and the Death of Recordkeeping) and the rise to prominence of Artificial Intelligence has resulted in renewed interest in automatic classification replacing the historical role of the filing clerk.
Google and Organizational Search
The spectacular success of Web search engines such as Google in retrieving desired information naturally makes people wish they could replicate that success within the organizations they work for. Google realized this too, and the history of the Google Search Appliance (which reaches its official end of life in 2019, after 17 years) illustrates their ultimately unsuccessful attempt to provide it. The basic reason is that there are seldom any hyperlinks to use for ranking search results within organizations. Ranking tends to be by keyword frequency, which gives poor results. No other search vendor has done any better, but a number of specialist Records Management companies are tackling this problem with automatic classification – assigning documents to file plan categories using automated methods. As paper records management gives way to digital, the problems of effective management of organizational records become more obvious, and vendors are seizing the opportunity.
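To see why keyword-frequency ranking gives poor results, consider a minimal sketch of the approach. The function and sample documents below are invented for illustration; real search engines layer many refinements (field weighting, length normalization) on top of this, but the core weakness is visible even here.

```python
# Hypothetical sketch of keyword-frequency ranking. Document text
# and names are invented for illustration.

def keyword_frequency_rank(docs, query):
    """Rank documents by how often the query terms occur in them."""
    terms = query.lower().split()
    scores = {}
    for name, text in docs.items():
        words = text.lower().split()
        scores[name] = sum(words.count(t) for t in terms)
    # Highest raw count first: note that a long or repetitive document
    # beats a short, highly relevant one.
    return sorted(scores, key=scores.get, reverse=True)

docs = {
    "memo": "budget budget budget meeting notes budget",
    "report": "annual budget summary for the finance committee",
}
print(keyword_frequency_rank(docs, "budget"))  # the repetitive memo wins
```

With no hyperlinks to supply an independent quality signal, a document that merely repeats the query term outranks the one a human would judge most relevant.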
One problem at least is becoming less serious – as born-digital documents extend further into the past, the task of extracting machine-readable text from scanned document images is diminishing, and technical improvements in optical character recognition mean that extracted text from scanned documents is more likely to represent what was written than in the past. However, the task of grouping and classifying documents in the manner in which filing clerks used to operate has not seen a similar improvement.
There are two main methods of automatic classification: rule-based and training-set-based. Rule-based classification has a long history (back to the 1970s) and is simple to apply and understand. The occurrence of a word or phrase in a document, or of words or phrases in proximity to each other, is sufficient to indicate membership of a particular class. In the simplest case, a single occurrence of a word or phrase is sufficient to place a document into a particular class. This approach is used by products intended for tagging rather than records management. Variants of this approach use statistical rather than binary classification, and may perform deterministic processing of text (such as word stemming) to improve performance. The task of defining a set of rules may be purely manual, or may use a training set of already-classified documents to identify words and phrases occurring more commonly in a particular class, generating rules semi-automatically.
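The simplest rule-based scheme described above can be sketched in a few lines. The class names, trigger phrases, and sample text here are all invented for illustration; a production system would add stemming, proximity rules, and statistical scoring.

```python
# Minimal sketch of single-occurrence rule-based classification.
# Rules and classes are hypothetical examples, not from any real product.

RULES = {
    "Contracts": ["agreement", "hereinafter"],
    "Invoices":  ["invoice number", "amount due"],
    "HR":        ["performance review", "annual leave"],
}

def classify(text):
    """Assign a document to the first class whose rule matches.
    A single occurrence of any listed phrase is enough."""
    lowered = text.lower()
    for cls, phrases in RULES.items():
        if any(phrase in lowered for phrase in phrases):
            return cls
    return "Unclassified"

print(classify("Please quote the Invoice Number when paying the amount due."))
```

The appeal of this approach is that every decision is explainable: you can point to the exact phrase that triggered the classification, and edit the rule if it misfires.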
Why AI can’t help much within organizations
Modern Artificial Intelligence classification may use neural nets, treating words and phrases simply as tokens, in the same way as features extracted from images. Such methods require large training sets, and the rules generated are unknowable and cannot be easily edited. The token-based analysis means that if the words in the text content are randomly re-ordered, turning the document into gibberish, the classification result remains unchanged, except for effects caused by the disruption of phrases. As more documents are classified, and the classifications approved, the number and quality of rules may increase, allowing vendors to claim that their systems learn from experience. However, the lack of transparency of the rules used by networks means that content unrelated to a document’s meaning (such as an organization’s address) may end up forming the basis of a classification rule.
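The order-blindness described above is easy to demonstrate: a bag-of-words representation (the token-count view that many statistical classifiers consume) is identical for a sentence and for its shuffled gibberish. The sentence below is an invented example.

```python
# Demonstration that token-count ("bag of words") features ignore word order.
import random
from collections import Counter

text = "the database upgrade failed because the server ran out of disk space"
tokens = text.split()

shuffled = tokens[:]
random.shuffle(shuffled)  # turn the sentence into gibberish

# Both orderings yield exactly the same bag of words, so a classifier
# that sees only token counts cannot tell sense from nonsense.
print(Counter(tokens) == Counter(shuffled))  # True
```

Only features built from multi-word phrases (n-grams) are disturbed by the shuffle, which is why the classification result changes little, if at all.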
Another problem with content-based classification is that the most important classification features of documents created for a small audience are not explicitly mentioned, as anyone viewing the document was assumed by its creator to know what it was about. A specific example from one company was a document relating to problems encountered with an Oracle database upgrade. The content never mentioned the word “Oracle” and only referred once to the version number of the database. The document name and provenance often are a much better guide to classification than the text content.
If Google Translate is so good, why can’t it classify my documents?
The increasing power of computers, and particularly the availability of massive cloud resources, masks the fact that automatic classification (and machine translation between languages) is based on statistical analysis of words and phrases, not on their meaning as understood by a human reader. The nuances of language are such that even the detection of negation in a text is a PhD topic, as is deciding whether the sentence “Time flies like an arrow” is a statement about a type of fly or about human experience. The spectacular success of Artificial Intelligence in playing rule-based board games such as chess and Go does not involve any automated understanding of the complexities of language use, and its success in quiz games is based more on good database design than anything else. Reputedly, when IBM’s Watson was asked “Who was the first woman in space?”, its answer was “Wonder Woman”, as it had no concept of the distinction between fiction and non-fiction.
Notwithstanding the limitations of statistical classification, any classification is usually better than none, even with a high error rate, but expectations should be tempered. The problem of language understanding may well be cracked in the future, at which point automated filing clerks will become available – but we are certainly not at that stage yet.