How does Secure Redact use AI?
While AI solutions have brought many benefits - for example, in medical diagnosis and drug development - their widespread proliferation has raised legitimate concerns, particularly in relation to privacy and security. For this reason, among many others, privacy and security are built into Secure Redact on top of its state-of-the-art performance.
Throughout training and deployment, Secure Redact makes no use of generative models or large language models, nor does the training procedure depend on any web-scraping or any customer data.
Artificial Intelligence (AI) vs. Machine Learning
Artificial Intelligence (AI) is increasingly a buzzword within the technology sector. As a result, many organizations question how it can be used without compromising intrinsic human values - such as privacy and ethics - or making unnecessary use of data. Machine Learning (ML), although a less well-known term, offers more clarity - particularly in relation to Pimloc’s Secure Redact technologies. To understand Secure Redact’s use of “AI”, and therefore its application of data, it is important to understand the distinction between AI and ML.
Although often used across various technologies, the term ‘artificial intelligence (AI)’ is broad and therefore unclear - in part, perhaps, because of its descriptive name. AI is:
Not artificial - because its power to make connections or figure things out (“its inferential power”) does not derive from synthetic sources or a prefabricated process.
Not intelligent - because its inferential power does not originate from a deep and exhaustive understanding of the fundamental principles that govern the conceptual or physical entities behind observable phenomena.
Modern definitions of AI often allude to a collection of machine-based methods (often involving computers) that may simulate aspects of human learning. Given that the scientific processes underlying human learning remain incompletely understood even today, this renders the term ambiguous and hard to apply with any accuracy or clarity.
The power of AI derives from large data sets of naturally occurring, real-world observations, often of human origin - such as photographs or written sentences. These are used to develop sophisticated quantitative models, typically by applying supervised learning rules to train deep network architectures. From this, some may deem the resulting accuracy levels, which approach and mimic human performance, to confer “intelligence”. However, ‘artificial’ and ‘intelligent’ it is not.
By contrast, the term ‘machine learning (ML)’ is defined far less ambiguously: the application of machine-based computational methods to train inference models on data sets of observations. By eliminating any reference to human-like behavior and narrowing the scope, the term is easier to apply correctly.
Some companies, such as Google, Amazon, and Microsoft, suggest that ML is a subset of AI. However, this risks both oversimplification and misinterpretation.
Consider, for example, a scatter distribution of 2D (xi, yi) point observations.
A conventional statistical method to establish whether x and y are associated is typically posed as a classification problem: two classes (1. ‘null’, meaning uncorrelated, and 2. ‘test’, meaning correlated) are gauged simultaneously using moment correlations (such as Pearson’s or Spearman’s correlation coefficients) and evaluated by sampling from cumulative probability distributions. A post-hoc straight-line fit y = mx + c, where m is the gradient and c is the intercept, is then performed to describe the nature of the correlation; this is posed as a regression problem, solved by applying a linear least-squares method to fit the straight line.
While both correlation classification and linear regression constitute legitimate ML approaches, they would be considered far too narrow in scope to constitute an AI solution.
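As a concrete illustration of this two-step procedure, here is a minimal Python sketch (using NumPy and SciPy on synthetic data of our own invention) that first classifies the points as correlated or uncorrelated via Pearson’s coefficient, then fits the post-hoc straight line by linear least squares:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic 2D observations: y is linearly related to x, plus noise.
x = rng.uniform(0.0, 10.0, size=100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 1.5, size=100)

# Classification step: decide between the 'null' (uncorrelated) and
# 'test' (correlated) classes using Pearson's moment correlation.
r, p_value = stats.pearsonr(x, y)
verdict = "correlated" if p_value < 0.05 else "uncorrelated"
print(f"r = {r:.3f}, p = {p_value:.3g} -> {verdict}")

# Regression step: post-hoc linear least-squares fit y = mx + c
# to describe the nature of the correlation.
m, c = np.polyfit(x, y, deg=1)
print(f"y = {m:.2f}x + {c:.2f}")
```

Both steps are entirely conventional statistics - legitimate ML, but hardly ‘AI’.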
ML in Secure Redact
Currently, the centrepiece of ML technology in Secure Redact is the combined tracking and detection capability for accurate discrimination of human heads, licence plates, and other foreground object types. While third-party detection models have been released to the public and have often been trained on publicly available data, they are not suitable for Secure Redact for two reasons:
These models are insufficiently accurate for practical use beyond basic and trivial test cases.
Publicly available datasets tend to focus on different image domains, such as still photography, whereas Secure Redact targets and is optimized for video data.
Consequently, the ML technology that performs object detection in Secure Redact:
Has been developed entirely in-house within Pimloc.
Uses models that are bespoke in both architecture and training regimen.
Is trained, validated, and tested exclusively on ML data generated internally.
Relies on video and image data captured with professional photographic equipment and stored on dedicated devices secured inside Pimloc premises.
Has any necessary foreground object annotation performed in-house by Pimloc.
No third party is used for annotation, since close and continual collaboration with the Pimloc development team is required to ensure the data domain is suitably relevant and optimal, whilst maintaining annotation accuracy at virtually 100%. Such accuracy is mandatory: a model of any complexity trained on data that is only 95% accurate will always be capped by that ceiling, encumbered by a baseline error rate of 5% as an irreducible minimum.
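A small simulation makes the ceiling concrete (a hypothetical Python sketch; the 5% figure is the illustrative rate from above, not a measured one). Even a model that predicts the truth perfectly cannot appear to score above the accuracy of the labels it is judged against:

```python
import numpy as np

rng = np.random.default_rng(42)

n = 100_000
truth = rng.integers(0, 2, size=n)  # true binary labels

# Annotations that are only 95% accurate: 5% of labels are flipped.
flipped = rng.random(n) < 0.05
labels = np.where(flipped, 1 - truth, truth)

# Even a perfect model (one that predicts the truth exactly) can only
# agree with the noisy labels ~95% of the time - an irreducible ceiling.
perfect_predictions = truth
agreement = (perfect_predictions == labels).mean()
print(f"apparent accuracy of a perfect model: {agreement:.3f}")  # ~0.95
```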
Pimloc’s ML team: Discriminative deep learning for accurate object detection
Accurate detection models necessarily implement deep network architectures. Training is not generative but discriminative, since the image modality is confined to the inputs of the network and is never predicted as its output.
Similar to the moment correlation with post-hoc linear least squares seen in the example above, object detection combines classification with regression. Classification is required to distinguish image regions that comprise foreground and background elements - in simpler terms, to recognise the ‘what’. Complementing this, regression is performed to fit box locations, succinctly represented by their left, top, right, and bottom values - the ‘where’.
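To make the classification-plus-regression pairing concrete, here is an illustrative detection head in Python/PyTorch (a toy sketch of the general technique; the layer sizes, class list, and names are invented and do not describe Pimloc’s bespoke architecture):

```python
import torch
import torch.nn as nn

class MinimalDetectionHead(nn.Module):
    """Toy detection head: classification ('what') plus box regression ('where')."""

    def __init__(self, feature_dim: int, num_classes: int):
        super().__init__()
        # Classification branch: a score per foreground class, plus background.
        self.classifier = nn.Linear(feature_dim, num_classes + 1)
        # Regression branch: (left, top, right, bottom) box coordinates.
        self.box_regressor = nn.Linear(feature_dim, 4)

    def forward(self, region_features: torch.Tensor):
        class_logits = self.classifier(region_features)  # the 'what'
        boxes = self.box_regressor(region_features)      # the 'where'
        return class_logits, boxes

# Example: 8 candidate regions, each described by a 256-dim feature vector.
features = torch.randn(8, 256)
head = MinimalDetectionHead(feature_dim=256, num_classes=2)  # e.g. head, plate
logits, boxes = head(features)
print(logits.shape, boxes.shape)  # torch.Size([8, 3]) torch.Size([8, 4])
```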
Tracking methods are then required to associate boxes in successive frames as belonging to the same foreground object - i.e. to establish which personal data relating to a given individual should be redacted across a video. This problem is addressed using correlation - a simple and common statistical method used throughout the historical research literature and across many scientific disciplines.
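As a rough illustration of frame-to-frame association (a hypothetical Python sketch: it greedily matches boxes by spatial overlap, a simple stand-in for the correlation-based similarity described above):

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two boxes in (left, top, right, bottom) form."""
    left, top = max(a[0], b[0]), max(a[1], b[1])
    right, bottom = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, right - left) * max(0.0, bottom - top)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def associate(prev_boxes, curr_boxes, threshold=0.3):
    """Greedily match each current box to the best-overlapping previous box."""
    matches, used = {}, set()
    for j, curr in enumerate(curr_boxes):
        scores = [iou(prev, curr) if i not in used else -1.0
                  for i, prev in enumerate(prev_boxes)]
        if scores and max(scores) >= threshold:
            best = int(np.argmax(scores))
            matches[j] = best  # the same track continues into this frame
            used.add(best)
    return matches             # unmatched current boxes would start new tracks

frame_t  = [np.array([10, 10, 50, 50]), np.array([100, 20, 140, 60])]
frame_t1 = [np.array([12, 11, 52, 52]), np.array([103, 22, 143, 62])]
print(associate(frame_t, frame_t1))  # {0: 0, 1: 1}
```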
A similar case applies to the automatic speech recognition (ASR) technology employed by Secure Redact in order to produce transcripts.
Classification is used to identify the phonemes of words from the sound waveform, and regression is required to establish when they occur within the media timeline. Finally, correlation is performed in order to attribute transcribed words to individual speakers. As before, the procedure is discriminative rather than generative, since the audio modality constitutes only the input, and never the predicted output, of the relevant model architectures.
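A toy Python sketch of the first two steps (hypothetical throughout: the phoneme inventory, frame length, and random ‘model output’ are invented, and for brevity segment timing here falls out of the frame grid rather than an explicit regression):

```python
import numpy as np

# Hypothetical phoneme inventory and analysis frame length.
PHONEMES = ["sil", "h", "eh", "l", "ow"]
FRAME_SECONDS = 0.02  # 20 ms per frame

rng = np.random.default_rng(1)
# Stand-in for a discriminative model's per-frame phoneme probabilities
# over a 1-second clip (50 frames x 5 phoneme classes).
frame_probs = rng.dirichlet(np.ones(len(PHONEMES)), size=50)

# Classification: the most probable phoneme per frame (the 'what').
frame_labels = frame_probs.argmax(axis=1)

# Timing: collapse runs of identical labels into (phoneme, start, end)
# segments on the media timeline (the 'when').
segments, start = [], 0
for t in range(1, len(frame_labels) + 1):
    if t == len(frame_labels) or frame_labels[t] != frame_labels[start]:
        segments.append((PHONEMES[frame_labels[start]],
                         start * FRAME_SECONDS, t * FRAME_SECONDS))
        start = t

for phoneme, begin, end in segments[:5]:
    print(f"{phoneme}: {begin:.2f}s - {end:.2f}s")
```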
What Secure Redact does not use
While the bespoke ML technology that powers Secure Redact exhibits best-in-class performance, it is in no way comparable to AI ‘chatbots’. The key difference is how the training data is procured. Rather than being confined to internally generated and manually annotated data, chatbot developers perform ‘web-scraping’ to harvest their data sets. Web-scraping is the process of using software bots or ‘scrapers’ to extract data from websites automatically. Since it requires no human intervention, it can be performed at scale to acquire immense datasets.
Secure Redact does not make use of the following when processing customer data for redaction:
Generative models
Generative models are defined by their capability to model joint probability density functions over both inputs ‘and’ outputs. As a consequence, they are widely known for their ability to synthesise outputs - typically in media domains such as audio, video, and text - with features similar to those occurring in the training data.
This contrasts with discriminative models, which are confined to modelling the conditional probability density function of the outputs ‘given’ the inputs.
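Expressed formally (a standard textbook contrast, with x denoting the inputs and y the outputs):

```latex
\underbrace{p(x, y) = p(y)\,p(x \mid y)}_{\text{generative: can sample new inputs } x}
\qquad \text{vs.} \qquad
\underbrace{p(y \mid x)}_{\text{discriminative: inputs are given, never sampled}}
```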
Since all Secure Redact ML solutions rely only on discriminative models that have no capacity to sample from the input domain space, they possess no such generative capability.
N.B. Secure Redact’s support/resource centre does include a simple text-based chat interface where users can ask questions about how to use various aspects of the Secure Redact product. This basic ‘AI assistant’ accesses a knowledge base made up of product information from the Secure Redact website, which it uses to answer specific product-related questions and to provide links to the relevant content pages for users to review.
Large language models (LLMs)
These are computationally intensive generative models that output text. They have become very popular since the publication of the Transformer architecture and the subsequent pre-training of its decoder sub-architecture to develop the Generative Pre-trained Transformer (GPT).
While a small ASR (automatic speech recognition) component is required within the ML pipeline of Secure Redact for transcription, it is inherently discriminative in design. It cannot generatively synthesise text; it is confined merely to inferring words from audio sounds.
Secure Redact, therefore, does not include any LLM technology.
Artificial General Intelligence (AGI)
Like AI, attempts to define AGI rest on a vague collection of machine-based methods, typically implemented using computers, that may simulate aspects of human learning. AGI proponents, however, usually weaken the definition to ‘simulate many aspects of human learning’; for the purposes of more realistic aspiration, ‘many’ is preferred over ‘all’. Therefore, in addition to being neither ‘artificial’ nor ‘intelligent’, AGI is also not ‘general’. Still, AGI remains confined to purely theoretical conception, a long way from any practical implementation.
