25. January 2023
Introduction
Digitization, automation, and connectivity continue to drive transformational change – and the automotive industry is no exception. As technology advances, the concept of self-driving vehicles is accelerating towards a reality. It is no longer a question of if, but when the roads of the future will be navigated by autonomous vehicles (AV).
Yet the AV development process generates, analyzes, and stores huge amounts of personal data in an era when privacy matters more than ever – as reflected by legislation such as the GDPR in Europe.
This report looks at why data is critical to the pursuit of innovation and why privacy standards threaten to slow down and even block this innovation, before explaining how Deep Natural Anonymization (DNAT) offers a new way forward.
Section 1
Video data and AV innovation
You may be surprised by just how much data autonomous vehicles generate, collect, analyze, and store. This data covers everything from owners and passengers to location and basic navigation to complex functions such as traffic management and local surroundings.
Automated object recognition through video plays an essential role in the development of autonomous vehicles. The imagery captured from every conceivable driving scenario is used to train AI systems to accurately recognize occupants, pedestrians, traffic signs, other road vehicles, etc. In this section we will look at the four main use cases of video technology:
- Advanced Driver Assistance Systems (ADAS) and Automated Driving Systems (ADS)
- Quality assurance (QA) & vehicle validation
- Certification
- Incident recording
Just one autonomous vehicle may generate around 4,000 GB (4 terabytes) of data every day. In the foreseeable future, autonomous vehicles would generate over 300 TB of data each year in the US alone.
– Brian Krzanich, Intel CEO, in discussion at Automobility LA
What kind of information does an AV collect?
Owner and passenger data
Autonomous vehicles need to identify drivers and passengers for all kinds of reasons, from authorizing use to personalizing comfort, safety, and entertainment settings.
Location and navigation data
Autonomous vehicles collect an array of data for navigation, such as route information, speed, real-time traffic data and points-of-interest along the planned route.
Sensor data
Autonomous vehicles also use sensors, cameras, dash cams, radar, thermal imaging devices, and light detection and ranging (LiDAR) devices to collect data about the vehicle’s operation and its surroundings.
Section 2
The advent of ADAS
The World Health Organization estimates that up to a staggering 1.35 million lives are lost to road accidents every single year. With human error among the most common causes, rolling out artificial intelligence (AI) and machine learning is the logical next step in road safety. We commonly see this in the form of advanced driver-assistance systems (ADAS).
AI is already starting to shape the future of numerous aspects of modern life, from manufacturing industries to the medical sector. And the technology is nothing new to many car manufacturers, whose vehicles have long featured ADAS to help drivers park, reverse, overtake, stay in lane, and retain control of their vehicles in adverse weather conditions more safely.
To function as they should, ADAS typically rely on AI-powered cameras and sensors trained to identify other vehicles, potential hazards, pedestrians, and even the facial expressions of the driver and passengers.
Common examples of ADAS
ADAS have become familiar to drivers through a wide variety of standard and optional features.
- Adaptive cruise control
- Forward-collision warning
- Lane assist
- Overtaking assistance
- Pedestrian detection
- Parking assist
- Driver drowsiness detection
The six levels of driver autonomy
The National Highway Traffic Safety Administration (NHTSA) recognizes six levels of driving automation, numbered 0 to 5, following the SAE J3016 standard. Currently, most AI-powered vehicles range from level 1 to 3 on this scale.
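As a compact reference, the six levels can be sketched as an enumeration. The level names below paraphrase the SAE J3016 wording adopted by NHTSA; they are not spelled out in this report, and the helper function is purely illustrative.

```python
from enum import IntEnum


class AutomationLevel(IntEnum):
    """Driving automation levels 0 to 5 (names paraphrased from SAE J3016)."""
    NO_AUTOMATION = 0
    DRIVER_ASSISTANCE = 1
    PARTIAL_AUTOMATION = 2
    CONDITIONAL_AUTOMATION = 3
    HIGH_AUTOMATION = 4
    FULL_AUTOMATION = 5


def in_typical_range_today(level: AutomationLevel) -> bool:
    """True for levels 1 to 3, where the report places most current vehicles."""
    return AutomationLevel.DRIVER_ASSISTANCE <= level <= AutomationLevel.CONDITIONAL_AUTOMATION


for level in AutomationLevel:
    marker = "typical today" if in_typical_range_today(level) else ""
    print(level.value, level.name, marker)
```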
The next milestone: autonomous vehicles (AVs)
ADAS has become a familiar part of the modern driving experience. The concept of fully autonomous vehicles has yet, however, to gain widespread acceptance among the public. Issues such as technological readiness, pricing, questions around safety and security, and regulation all need to be addressed. Yet AVs also offer huge potential to transform driving and offer tremendous value to owners: the ability to work, watch a movie, or catch up on social media will all become possible while on the move. According to McKinsey, a progressive scenario would see fully autonomous cars accounting for up to 15 percent of passenger vehicles sold worldwide in 2030.
McKinsey estimates that up to 15 percent of the new cars sold in 2030 could be fully autonomous.
According to Navigant Research, annual sales of vehicles with self-driving capabilities could reach 94.7 million by 2035.
(Source: Disruptive trends that will transform the auto industry, McKinsey, 2016)
Section 3
The role of video in AV development
Full vehicle autonomy effectively transforms a driver into a passenger by allowing AI to take over human decision-making, from speed to steering. For example, automated visual inspection gives cameras the capability to make sense of traffic signals and street signs, as well as to recognize vehicles and pedestrians. Lidar sensors are another example: they measure the distance between a vehicle and its surroundings so that the vehicle can automatically brake or accelerate in the face of potential danger. Teaching these systems to fully understand and react to all possible driving scenarios requires vast amounts of video data, which is captured and processed in real time by a host of sensors.
As yet, there is no consensus on just how much data intelligent vehicles will generate. Autonomous test vehicles typically generate between 5TB and 20TB of data per day.
– Mark Pastor, Archive Product Marketing Director, Quantum
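To put the 5 TB to 20 TB per day range into perspective, here is a rough back-of-the-envelope estimate. The fleet size and campaign length are illustrative assumptions, not figures from this report, and 1 PB is taken as 1,000 TB.

```python
def fleet_data_tb(vehicles: int, days: int, tb_per_vehicle_day: float) -> float:
    """Total raw data in terabytes for a test fleet over a campaign."""
    return vehicles * days * tb_per_vehicle_day


# Hypothetical campaign: 10 test vehicles driving for 100 days.
low = fleet_data_tb(vehicles=10, days=100, tb_per_vehicle_day=5)
high = fleet_data_tb(vehicles=10, days=100, tb_per_vehicle_day=20)

print(f"Estimated volume: {low / 1000:.0f} to {high / 1000:.0f} PB")
```

Even a modest test fleet under these assumptions lands in the petabyte range, which is why storage and processing capacity feature so prominently in AV development.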
3.1 Video and QA & vehicle validation
QA testing of ADS and ADAS functionalities, and of the vehicle as a whole, represents a crucial part of automotive production. Testing is performed to make sure driving systems work as expected, as well as to identify bugs, detect gaps that require a tweak to machine learning algorithms, complete internal processes, and confirm the car is ready to move on to the next stage of production.
Validation is typically performed on test vehicles first in closed areas and then in real-world situations. As a rule, the further a vehicle is in the production cycle, the more data will be collected. Video plays a key role not only as a reference source, but also in understanding any incident or interpreting data from other sources such as lidar or radar. Indeed, one single project can generate up to 10,000 hours of video data.
3.2 Video and certification
OEMs are required to navigate a sea of certifications to win the approval of relevant authorities and launch a new vehicle in a market. At the same time, they also need to make available a vast array of technical information, which is based on standard certification processes. In addition to WLTP for average gasoline consumption, a dizzying array of other certifications pertains to EURO norms, CO2, NOx, etc. Without this information, manufacturers are quite simply not allowed to sell cars.
The certification process needs to be conducted not only for each market or region, but also for each sub-model. Video is generally used for documentation and as a reference for potential future investigations, yet the process generates a substantial volume of data.
3.3 Video and the production recording of incidents
OEMs continue to collect increasing volumes of valuable data, even after a vehicle is in the hands of the consumer. The more data they generate from more vehicles in a wider range of driving situations, the greater the insights they can generate to support R&D, drive continuous product updates, and optimize production processes. For example, recorded data allows a manufacturer to analyze incidents, evaluate the root cause, and take action to prevent such an incident from happening again.
This data also flows into vast ‘data lakes’ that OEMs can sell to insurance companies or other industries.
The two types of personal data
Primary data: recorded and collected inside the vehicle. This might be the kind of music the driver listens to, allowing AI to offer personalized recommendations. Or it may be preferred seat and comfort settings. Drivers will usually have consented to this in the terms and conditions.
Secondary data: indirect data such as other vehicles, cyclists, or pedestrians recorded without their awareness or consent.
Section 4
Video data in the age of privacy
The development of autonomous vehicles necessitates the accumulation of vast amounts of data. At the same time, a growing framework of regulatory standards has evolved to protect the public from the use, misuse, and abuse of personal data.
Foremost among these regulations is the GDPR, drafted and passed by the EU yet applicable to any organization that collects the personal data of EU citizens and residents.
Article 7 of the GDPR sets out the conditions for consent. Where processing relies on consent, organizations must be able to demonstrate that individuals (the ‘data subjects’) have agreed to the processing of their personal data. The article also asserts that data subjects retain the right to withdraw their consent at any time.
Failure to comply can be costly: supervisory authorities are empowered under the GDPR to levy substantial fines on organizations that fall short. These fines can reach up to 20 million euros or 4% of global annual revenue, whichever is higher. Data subjects affected by a breach also have the right to seek compensation for damages.
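The "whichever is higher" rule is easy to misread, so a minimal sketch of the upper limit may help. The function name and the example revenue figures are illustrative assumptions; the actual fine in any given case is set by the supervisory authority and can be far below this ceiling.

```python
def max_fine_eur(global_annual_revenue_eur: float) -> float:
    """Upper limit of a fine: the greater of EUR 20 million or 4% of revenue."""
    return max(20_000_000, global_annual_revenue_eur * 4 / 100)


# Hypothetical companies with EUR 100 million and EUR 10 billion in revenue.
for revenue in (100_000_000, 10_000_000_000):
    print(f"Revenue EUR {revenue:,}: ceiling EUR {max_fine_eur(revenue):,.0f}")
```

The two limits meet at 500 million euros in revenue: below that, the flat 20 million euro ceiling applies, and above it the 4% rule takes over.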
In 2022, fines handed out to organizations in breach of privacy standards amounted to 832 million euros.
(Source: enforcementtracker.com)
4.1 GDPR: a tricky road to navigate
Since the GDPR was specifically developed to protect personal data, any information that can be used to identify an individual falls under its remit. That includes the data collected in the development of AVs, since much of this information makes it possible to identify drivers and their passengers in addition to other vehicles and passers-by.
Sensitive personal data includes biometric data such as voice or fingerprint recognition; behavioral data such as driving patterns, speed, and acceleration; and personally identifiable information such as faces, license plates, etc.
As we have already seen, collecting, using, and storing personal data on the basis of consent requires the demonstrable agreement of the data subjects. Yet it is unfeasible to expect OEMs to track down every other driver, pedestrian, or cyclist recorded as a secondary data subject in order to gain this consent.
Total fines for breaches of the GDPR exceeded one billion euros in summer 2021.
4.2 The cost of blocking innovation
We are faced with a situation in which data protection laws threaten to put the brakes on innovation, since much of this innovation is powered by video data. And with 84% of executives agreeing that innovation is essential for growth (source: McKinsey), the impact can ultimately affect profits and long-term success.
According to Bitkom, Germany’s digital association, over 75% of the 502 companies surveyed agreed that innovation projects have failed due to the legal obligations imposed by the GDPR. And 86% have halted projects due to uncertainties in dealing with the regulation.
The Booz & Co. (now Strategy&) 2011 Global Innovation 1000 report discovered that the most innovative organizations achieve 22% higher EBITDA growth than their less creative counterparts.
The areas most frequently affected include setting up data pools (54%), process optimization in customer service (37%), projects for improving data use (37%), and the use of new technologies like AI or big data (37%).
Catch-22
If OEMs continue with their innovation projects while remaining compliant with legislation, they face one of three choices:
Across all industries, digital technologies are the most important drivers of innovation. We need to better balance data protection and data use.
– Susanne Dehmel, Managing Director, Bitkom
The essential role of data transfer
According to a survey from Bitkom, Germany’s digital association, 12% of companies would fall behind in the global competition for innovation if it were no longer possible to process personal data outside the EU.
For example, international data transfers to non-EU countries play a major role in the German economy. Almost half of all companies exchange data with external service providers from non-EU countries, a quarter with business partners, and 12% with other company departments.
Section 5
Enter anonymization
Just as technology puts manufacturers at risk of breaching privacy standards, so technology provides an answer to this catch-22 situation. Anonymization techniques, such as blurring faces and license plates or pixelation, prevent data from being identified and ensure that it can be safely used in compliance with the GDPR and similar regulatory frameworks.
Yet this comes at a price. Conventional anonymization techniques are incapable of preserving the accuracy and integrity of the original data. This impact on data quality in turn compromises its compatibility with machine learning and analytics.
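To illustrate why conventional techniques lose information, here is a minimal sketch of pixelation on a grayscale image. Representing the image as a nested list of integers is a simplifying assumption for the example; a production system would work on real video frames with an imaging library.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel values, row by row


def pixelate(image: Image, top: int, left: int, height: int, width: int,
             block: int = 4) -> Image:
    """Return a copy of image with the given region replaced by block averages."""
    out = [row[:] for row in image]
    for by in range(top, top + height, block):
        for bx in range(left, left + width, block):
            ys = range(by, min(by + block, top + height))
            xs = range(bx, min(bx + block, left + width))
            # Every pixel in the block collapses to one averaged value.
            mean = sum(image[y][x] for y in ys for x in xs) // (len(ys) * len(xs))
            for y in ys:
                for x in xs:
                    out[y][x] = mean
    return out


# An 8x8 test image in which every pixel has a different value.
original = [[y * 8 + x for x in range(8)] for y in range(8)]
blurred = pixelate(original, top=0, left=0, height=8, width=8, block=4)

print(len({v for row in original for v in row}), "distinct values before")
print(len({v for row in blurred for v in row}), "distinct values after")
```

In this example the region goes from 64 distinct pixel values to 4. The identity is hidden, but so are the expression, gaze direction, and other details that a machine learning model would need.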
Deep Natural Anonymization, or DNAT, eliminates this trade-off. Based on generative AI, this unique technology creates synthetic faces and replica license plates that prevent the original subjects from being recognized.
This anonymization technique is much more valuable than simply blurring faces and license plates. Facial features and physical attributes can still be recognized, and data can be used to train machine learning models. DNAT combines technical innovation with effective protection of personal privacy, distinguishing it from other redaction techniques. Importantly, this approach ensures that video recordings remain compliant with the strict data protection guidelines stipulated by GDPR and other regulations.
– Philipp Wende, Senior Consultant Automotive & Innovation Program Lead, DXC
At the same time, DNAT preserves the quality and integrity of the original data and, in doing so, retains attributes such as age and gender to preserve semantic segmentation. This makes it the only anonymization technique capable of powering analytics and machine learning.
- DNAT is safe: re-identification by facial recognition technology is impossible, with synthetic faces randomly generated and non-reversible.
- DNAT is accurate: age, gender, race, emotions, facing direction, and intention are retained for analysis and AI development.
- DNAT is compliant: EuroPriSe certification for privacy-compliant IT products.
This allows OEMs to safely use videos and images to power machine learning, yet without the threat of receiving heavy fines or halting innovation.
How DNAT works
DNAT uses AI to automatically detect faces and other identifiable elements such as license plates in the original images and videos. The technology then randomly generates artificial replacements that reflect the original attributes.
For example, it is often important to preserve facial attributes such as gender, emotions, intent or age for further analytics. DNAT retains any information that does not contain sensitive personal data without modification. In doing so, it effectively removes the compromise between anonymizing data and retaining the original quality.
The technology then applies these nonreversible overlays to the original, ensuring that re-identification by facial recognition technology is impossible.
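The three steps described above (detect, generate, overlay) can be sketched as a small pipeline. This is a conceptual illustration only, not brighter AI's implementation or API: `Detection`, `anonymize`, and the stub detector and generator are hypothetical names, and a real system would use trained neural networks in place of the stubs.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List

Image = List[List[int]]  # grayscale pixel values, row by row


@dataclass(frozen=True)
class Detection:
    """A region holding a personal identifier, plus attributes to preserve."""
    top: int
    left: int
    height: int
    width: int
    attributes: Dict[str, str]  # non-identifying, e.g. {"age_group": "adult"}


Detector = Callable[[Image], List[Detection]]
PatchGenerator = Callable[[Detection], Image]


def anonymize(image: Image, detect: Detector, generate: PatchGenerator) -> Image:
    """Replace every detected region with a synthetic patch; keep the rest as is."""
    out = [row[:] for row in image]
    for det in detect(image):
        # The generator sees only the region size and attributes, never the
        # original pixels, so the overlay cannot be reversed to recover them.
        patch = generate(det)
        for dy in range(det.height):
            for dx in range(det.width):
                out[det.top + dy][det.left + dx] = patch[dy][dx]
    return out


def stub_detect(image: Image) -> List[Detection]:
    """Stand-in for a face or license plate detector."""
    return [Detection(top=1, left=1, height=2, width=2,
                      attributes={"age_group": "adult"})]


def stub_generate(det: Detection) -> Image:
    """Stand-in for a generative model conditioned on det.attributes."""
    rng = random.Random()
    return [[rng.randrange(256) for _ in range(det.width)]
            for _ in range(det.height)]


frame = [[0] * 4 for _ in range(4)]
result = anonymize(frame, stub_detect, stub_generate)
```

Passing the detector and generator in as functions keeps the overlay logic independent of any particular model, and pixels outside the detected regions are copied through unmodified.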
This technology makes data collection in public compliant according to privacy regulations worldwide, such as GDPR in Europe, CSL in China and the upcoming CCPA in the US.
– The Washington Post, March 21st, 2019
Case Study #1
brighter AI + SEGULA: GDPR-compliant data transfer
SEGULA Technologies, an industry-leading engineering group, was tasked with certifying the ADAS for an OEM client located outside the EU. The process required numerous test drives, which produced 8,000 hours of video data containing personally identifiable information (PII) of pedestrians and passing vehicles.
The data had to be transferred from SEGULA’s data center in Germany to the OEM’s headquarters within 12 weeks. Yet the GDPR obliges companies to fully anonymize all personal data before transfer to a third country outside the EU/EEA.
Our anonymization software made it possible to anonymize this huge amount of data within an exceptionally tight timeline, saving the significant time and cost of manual redaction.
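A quick calculation shows how tight this timeline is. The sketch assumes processing runs around the clock for the full 12 weeks, which the case study does not state; any downtime or transfer time would raise the required speed further.

```python
HOURS_PER_WEEK = 7 * 24


def required_speedup(video_hours: float, deadline_weeks: float) -> float:
    """Aggregate processing speed needed, as a multiple of real time."""
    wall_clock_hours = deadline_weeks * HOURS_PER_WEEK
    return video_hours / wall_clock_hours


speedup = required_speedup(video_hours=8_000, deadline_weeks=12)
print(f"Required aggregate speed: {speedup:.2f}x real time")
```

Under these assumptions the pipeline has to anonymize roughly four hours of footage for every hour that passes, which puts frame-by-frame manual redaction out of reach.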
With brighter AI, we successfully transferred a large amount of data to our client outside the EU under time pressure. By using brighter AI’s fully automated software, we saved thousands of hours of manual work and were able to provide our client with high-quality data for further processing. We valued the accuracy and adaptability of brighter AI’s anonymization solution and the support from the team.
– Daniel Scholz, Team Leader Electronics & Software Development, SEGULA Technologies Services GmbH
Case Study #2
brighter AI + CSI S.p.A.: Data sharing outside the EU
Italy-based CSI Automotive offers various services from product development and validation to data acquisition and management software.
When CSI tested advanced driver-assistance systems for a Chinese automobile manufacturer, the main challenge lay in cross-border data transfers to mainland China. Vehicle cameras capture personal identifiers such as faces and license plates during the testing process, while the GDPR obliges companies to anonymize this kind of data.
CSI’s Engineering Department used our scalable cloud-based image and video anonymization software. It offered the best results for both license plates and face anonymization, while reducing hardware footprint and maintenance costs. This allowed CSI to share data, something that would not have been possible by other means like encryption.
According to our customer, the anonymization process is seen as a requirement to be compliant with EU regulation. For them, accuracy is important and for us, as a partner, both performance – in terms of speed – and stability of the solution is crucial. […] brighter AI’s service meets our requirements. They were really helpful, and we easily integrated brighter AI’s software into our workflow. In addition, the customer experience brighter AI provides is great, because we get dedicated support with prompt replies and comprehensive troubleshooting.
– Salvatore Musumeci, Powertrain and Vehicle Validation Manager, CSI S.p.A.
Case Study #3
brighter AI + DXC: Scalable GDPR compliance for automotive and beyond
US-based DXC Technology is a multinational provider of B2B IT services. To accelerate the development of autonomous vehicles, DXC has developed DXC Robotic Drive, a platform that supports leading OEMs in the collection, storage, and analysis of data.
With users processing hundreds of petabytes of data, it is essential to protect personally identifiable information in line with privacy regulations such as GDPR. Compliance, scalability, and speed are all important to DXC customers, so the company needed to integrate robust and accurate automatic anonymization software directly into the platform without compromising data quality.
brighter AI’s anonymization software has made it possible to protect data at scale. And thanks to the ease of integration, the two partners could move from initial discussions to integrated solution in just a few weeks.
This anonymization technique is much more valuable than simply blurring faces and license plates, because facial features and physical attributes can still be recognized, and that data can be used to train machine learning models. The solution combines technical innovation with effective protection of personal privacy, distinguishing it from other redaction techniques. Importantly, this approach ensures that video recordings are in compliance with the strict data protection guidelines stipulated by GDPR and other regulations.
– Philipp Wende, Senior Consultant Automotive & Innovation Program Lead, DXC
Case Study #4
brighter AI + OnREX: Eliminating manual tasks
German company OnREX GmbH is the developer of DYNAREX, a mobile office solution for automotive appraisers. The software is used to create expert opinions on liability and damage to vehicles.
An average of 25 to 30 digital damage photos are processed per appraisal. Because all personally identifiable information (PII) must be anonymized before the photos are passed on to third parties, the company required an automated, scalable, robust PII anonymization software that also offered highly accurate license plate and face detection – even when objects are partly occluded or distorted.
OnREX chose our scalable, cloud-based image and video anonymization software. The option to automatically redact images inside the vehicle appraisal app frees DYNAREX users from performing the task, saving more than 100 hours of tedious manual work within the first year alone.
We decided to use the service of brighter AI because the functionality to anonymize license plates and persons works very well. We were particularly surprised by the accuracy of the recognition method. Before the integration, license plates on damage photos had to be manually masked by the vehicle appraiser. Our customers have been able to use the service for about 2 months now. They appreciate the speed, security and naturally executed anonymization on damaged photos. The API provided was well documented and could be quickly integrated into our cloud solution.
– Jens Dürasch, Managing Director, OnREX GmbH
Case Study #5
brighter AI + Valeo: 10,000+ images anonymized
Valeo is a global automotive supplier providing a wide range of products to automakers and the aftermarket. The company requires vast amounts of image data for autonomous driving research and training neural networks and validation systems.
Their WoodScape dataset is the first extensive automotive fisheye dataset, consisting of images from four surround-view cameras collected across several countries. In the face of strict privacy regulations, Valeo needs to anonymize the data without impacting the unique value of WoodScape, which includes semantic segmentation annotation and ML and analytics compatibility.
A natural appearance and minimal pixel impact on the visual data are essential. Deep Natural Anonymization makes this possible. DNAT is camera-agnostic, so it works for any setting and format, including fisheye. Personally identifiable information is accurately detected and replaced, enabling annotation, analytics, and machine learning while remaining privacy-compliant.
Flexible deployment offers the freedom to anonymize on certified servers in the cloud or on-premise, where the data resides. Valeo chose the latter to retain full control over the environment.
WoodScape has publicly collected image data from several countries and there is a significant risk of violating privacy regulations. Anonymizing personally identifiable information like faces and license plates with traditional approaches like pixelating causes artifacts in the image and can have a significant negative impact on the quality of the trained model. To tackle the dilemma, we made use of brighter AI’s Deep Natural Anonymization.
– woodscape.valeo.com/dataset
Conclusion: A new way forwards
Video data has become an integral driver of innovation in the automotive industry. At the same time, increasingly robust and expansive privacy laws place tight restrictions on how this data can be used – and even threaten its use altogether.
Conventional anonymization technology lacks the capability to preserve data quality, which in turn impacts the training of AI algorithms. Automotive companies would therefore face an impossible choice of breaching data laws or compromising the integrity of innovation.
DNAT offers a new direction. By ending the trade-off between privacy and video analytics, it empowers companies to innovate safely and responsibly.
Section 6
Meet brighter AI
brighter AI’s anonymization solutions are designed to enable the collection of automotive data in full compliance with the latest privacy standards. In fact, DNAT is the only certified value-preserving video redaction software to guarantee full GDPR compliance. At the same time, it preserves the data quality of the original image to drive AI innovation and machine learning.
Supporting both current development projects and future vehicle fleet data collection, our software is ideal for training machine learning models for applications such as autonomous driving without compromising data quality.
We use deep learning to recognize objects: artificial neural networks trained on large data sets including a range of resolutions and perspectives. This offers a higher degree of accuracy and robustness compared to conventional approaches. If you find this interesting, check out our report on the accuracy of machine learning models trained on anonymized data.
Approved by privacy professionals and research scientists, our anonymization software seamlessly integrates into any platform from edge to on-premise to cloud. And it is backed up by cloud compliance & data protection warranties, full support, and zero maintenance costs.
brighter AI’s solution was easily integrated and the natural anonymization was what we needed for improvement of line & detection validation strategy.
– Vaclav Schiybel, System Validation Platform Manager, Valeo
brighter AI has solved a fundamental problem of using and storing image and video data in compliance with data protection regulations.
– Handelsblatt, Nov. 23rd, 2019
DNAT automatically detects a personal identifier such as a face and generates a synthetic replacement, protecting identities while keeping necessary information for analytics or machine learning. brighter AI provides the world’s most advanced image and video redaction technology.
– Marian Gläser, CEO & Co-founder