Model extraction is an attack where adversaries use repeated queries to recreate a deployed machine learning model. By analyzing outputs in response to crafted inputs, attackers can approximate the model’s parameters, decision boundaries, or architecture—effectively stealing the intellectual property encoded in the model.
This attack is also referred to as model cloning or reverse engineering. It is especially prevalent against public-facing APIs and subscription-based ML products, where attackers can gain black-box access and observe outputs at scale.
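The black-box setting described above can be sketched end to end: the attacker never sees the victim's parameters, only its answers to chosen queries, and trains a local surrogate on those (input, output) pairs. This is a minimal illustration using scikit-learn, where the "victim" is a synthetic stand-in for a remote API.

```python
# Minimal sketch of black-box model extraction. The "victim" is a stand-in
# (a logistic regression on synthetic data); a real attacker would call a
# remote prediction API instead of victim.predict.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# --- Victim (hidden from the attacker; only .predict is exposed) ---
X_secret = rng.normal(size=(1000, 2))
y_secret = (X_secret[:, 0] + X_secret[:, 1] > 0).astype(int)
victim = LogisticRegression().fit(X_secret, y_secret)

# --- Attacker: craft inputs, record the victim's black-box responses ---
queries = rng.uniform(-3, 3, size=(2000, 2))   # attacker-chosen inputs
labels = victim.predict(queries)               # observed outputs

# --- Train a surrogate on the (query, response) pairs ---
surrogate = DecisionTreeClassifier(max_depth=5).fit(queries, labels)

# Agreement between surrogate and victim on fresh inputs approximates
# how faithfully the decision boundary was stolen.
test = rng.uniform(-3, 3, size=(500, 2))
agreement = (surrogate.predict(test) == victim.predict(test)).mean()
print(f"surrogate/victim agreement: {agreement:.2%}")
```

High agreement here reflects the simplicity of the toy boundary; against real models, attackers need far more queries or smarter query selection, which is exactly what the techniques below provide.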
Motivations include:

- Avoiding per-query costs or subscription fees by replicating the model locally.
- Stealing the intellectual property embedded in proprietary training data, architecture choices, and tuning effort.
- Building a local surrogate to craft transferable adversarial examples or mount privacy attacks (e.g., membership inference) without rate limits or monitoring.
Model extraction is feasible even with limited access if the attacker uses:

- Adaptive query strategies (e.g., active learning) that concentrate queries near the decision boundary to maximize information gained per query.
- Synthetic or out-of-distribution inputs that probe regions where the model's behavior is most revealing.
- Distillation-style training that treats the victim's outputs, especially confidence scores, as soft targets for the surrogate.
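Soft targets are the key lever under a tight query budget: a confidence score carries more information than a hard label, so fewer queries are needed. Below is a hedged sketch of distillation-style extraction, assuming the victim exposes class probabilities (here via scikit-learn's `predict_proba` on a synthetic stand-in victim).

```python
# Sketch of distillation-style extraction under a limited query budget,
# assuming the victim returns confidence scores. The victim is a synthetic
# stand-in; a real attacker would query a remote API.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 2))
victim = LogisticRegression().fit(X, (X[:, 0] - X[:, 1] > 0).astype(int))

budget = 200                                   # small query budget
queries = rng.uniform(-3, 3, size=(budget, 2))
soft = victim.predict_proba(queries)[:, 1]     # soft targets in [0, 1]

# Regress on the soft scores; thresholding at 0.5 recovers hard labels.
surrogate = DecisionTreeRegressor(max_depth=6).fit(queries, soft)

test = rng.uniform(-3, 3, size=(300, 2))
agree = ((surrogate.predict(test) > 0.5).astype(int)
         == victim.predict(test)).mean()
print(f"agreement with only {budget} queries: {agree:.2%}")
```

The same budget spent on hard labels alone would teach the surrogate less per query, which is why several defenses below focus on degrading the returned scores.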
Defense strategies include:

- Rate limiting and query auditing to cap how much information any one client can pull.
- Output perturbation: returning rounded probabilities, only the top-k classes, or labels alone instead of full confidence scores.
- Watermarking the model so that stolen copies can later be identified.
- Monitoring query distributions, since extraction traffic often looks unlike organic usage.
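Output perturbation is the simplest of these to implement at the serving layer. The sketch below (the function name `harden_output` and its parameters are illustrative, not from any particular library) rounds probabilities and exposes only the top-k classes before a response leaves the API.

```python
# Illustrative output-perturbation defense: round probabilities and expose
# only the top-k classes, reducing the information each query leaks to an
# extraction attacker. `harden_output` is a hypothetical helper, not a
# library API.
import numpy as np

def harden_output(probs: np.ndarray, decimals: int = 1, top_k: int = 1):
    """Round class probabilities and return only the top-k entries."""
    rounded = np.round(probs, decimals)
    order = np.argsort(rounded)[::-1][:top_k]
    return {int(c): float(rounded[c]) for c in order}

raw = np.array([0.6231, 0.2514, 0.1255])  # full-precision model output
print(harden_output(raw))                  # coarse top-1 view of `raw`
```

The trade-off is accuracy of the response versus leakage: legitimate users rarely need four decimal places, but an attacker running distillation loses most of the soft-target signal.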