
Over the past few years, machine learning has gained popularity and is being integrated into more and more technologies and tools. A major factor in this popularity has been the free distribution of data sets and training results. This is especially useful to organizations without large budgets to acquire data, but it raises a question: is the data accurate and trustworthy?

In data poisoning, also called information poisoning, information is deliberately manipulated. Changes to even a small amount of machine learning training data may be enough to cause serious problems, such as letting malicious input evade detection or skewing the algorithm in a direction the attacker desires.
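To make the mechanism concrete, here is a minimal sketch of one form of poisoning, label flipping, run against a toy one-dimensional nearest-centroid classifier. The data, the labels, and the function names (`train_centroids`, `predict`) are all hypothetical and chosen only for illustration. Real attacks and real models are far more complex.

```python
from statistics import mean


def train_centroids(samples):
    """Return the mean feature value per label (a 1-D nearest-centroid model)."""
    by_label = {}
    for value, label in samples:
        by_label.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in by_label.items()}


def predict(centroids, value):
    """Pick the label whose centroid is closest to the given value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


# Hypothetical toy data: (score, label). Higher scores look malicious.
clean = [
    (1.0, "benign"), (2.0, "benign"), (3.0, "benign"),
    (8.0, "malicious"), (9.0, "malicious"), (10.0, "malicious"),
]

# The attacker injects two malicious-looking samples mislabeled as benign,
# which drags the "benign" centroid toward the malicious region.
poisoned = clean + [(10.0, "benign"), (11.0, "benign")]

suspicious_input = 7.0
print(predict(train_centroids(clean), suspicious_input))     # malicious
print(predict(train_centroids(poisoned), suspicious_input))  # benign
```

With the clean data, the input 7.0 is closer to the malicious centroid (9.0) than to the benign one (2.0). After the two mislabeled samples are added, the benign centroid moves to 5.4 and the same input is classified as benign, so it escapes detection.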

When integrating open-source material into an existing system, it is of utmost importance to test and verify that material as it is integrated. Hackers may use code poisoning to manipulate models that automate supply chains, to steal information from organizations and companies, or to expose personal information.
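One basic verification step, sketched below, is to check a downloaded data set against a checksum published by its maintainer before using it for training. This is an assumption about what verification might look like, not a method described by the source. The helper names `sha256_of_file` and `verify_dataset` are hypothetical, and the expected digest is assumed to come from a trusted channel.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=65536):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_dataset(path, expected_sha256):
    """Return True only if the file's digest matches the published one."""
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_sha256.strip().lower())
```

A matching checksum only shows that the file is the one the publisher released and was not altered afterward. It does not show that the published data itself is free of poisoned samples, so testing the resulting model is still necessary.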

A machine learning specialist at Microsoft told formtek.com that the field is still immature in practice. According to him, machine learning systems are extremely costly, and if one is poisoned the damage would be significant, both financially and in the time it would take to locate the infected data. Because practical solutions are still several years away, he argues, for now such errors must be corrected manually and the model retrained on good data.
