![Google’s reward standards for reporting bugs in AI merchandise Google’s reward standards for reporting bugs in AI merchandise](https://geeks-news.com/wp-content/uploads/http://2.bp.blogspot.com/-7bZ5EziliZQ/VynIS9F7OAI/AAAAAAAASQ0/BJFntXCAntstZe6hQuo5KTrhi5Dyz9yHgCK4B/s1600/googlelogo_color_200x200.png)
[ad_1]
Class
Assault State of affairs
Steerage
Immediate Assaults: Crafting adversarial prompts that enable an adversary to affect the conduct of the mannequin, and therefore the output in ways in which weren’t supposed by the applying.
Immediate injections which might be invisible to victims and alter the state of the sufferer’s account or or any of their property.
In Scope
Immediate injections into any instruments wherein the response is used to make choices that straight have an effect on sufferer customers.
In Scope
Immediate or preamble extraction wherein a consumer is ready to extract the preliminary immediate used to prime the mannequin solely when delicate data is current within the extracted preamble.
In Scope
Utilizing a product to generate violative, deceptive, or factually incorrect content material in your personal session: e.g. ‘jailbreaks’. This contains ‘hallucinations’ and factually inaccurate responses. Google’s generative AI merchandise have already got a devoted reporting channel for a lot of these content material points.
Out of Scope
Coaching Knowledge Extraction: Assaults which might be capable of efficiently reconstruct verbatim coaching examples that include delicate data. Additionally referred to as membership inference.
Coaching knowledge extraction that reconstructs gadgets used within the coaching knowledge set that leak delicate, private data.
In Scope
Extraction that reconstructs nonsensitive/public data.
Out of Scope
Manipulating Fashions: An attacker capable of covertly change the conduct of a mannequin such that they’ll set off pre-defined adversarial behaviors.
Adversarial output or conduct that an attacker can reliably set off by way of particular enter in a mannequin owned and operated by Google (“backdoors”). Solely in-scope when a mannequin’s output is used to vary the state of a sufferer’s account or knowledge.
In Scope
Assaults wherein an attacker manipulates the coaching knowledge of the mannequin to affect the mannequin’s output in a sufferer’s session in accordance with the attacker’s choice. Solely in-scope when a mannequin’s output is used to vary the state of a sufferer’s account or knowledge.
In Scope
Adversarial Perturbation: Inputs which might be supplied to a mannequin that leads to a deterministic, however extremely sudden output from the mannequin.
Contexts wherein an adversary can reliably set off a misclassification in a safety management that may be abused for malicious use or adversarial acquire.
In Scope
Contexts wherein a mannequin’s incorrect output or classification doesn’t pose a compelling assault situation or possible path to Google or consumer hurt.
Out of Scope
Mannequin Theft / Exfiltration: AI fashions usually embrace delicate mental property, so we place a excessive precedence on defending these property. Exfiltration assaults enable attackers to steal particulars a few mannequin akin to its structure or weights.
Assaults wherein the precise structure or weights of a confidential/proprietary mannequin are extracted.
In Scope
Assaults wherein the structure and weights should not extracted exactly, or once they’re extracted from a non-confidential mannequin.
Out of Scope
Should you discover a flaw in an AI-powered instrument aside from what’s listed above, you may nonetheless submit, supplied that it meets the {qualifications} listed on our program web page.
A bug or conduct that clearly meets our {qualifications} for a sound safety or abuse difficulty.
In Scope
Utilizing an AI product to do one thing probably dangerous that’s already potential with different instruments. For instance, discovering a vulnerability in open supply software program (already potential utilizing publicly-available static evaluation instruments) and producing the reply to a dangerous query when the reply is already obtainable on-line.
Out of Scope
As per our program, points that we already learn about should not eligible for reward.
Out of Scope
Potential copyright points: findings wherein merchandise return content material showing to be copyright-protected. Google’s generative AI merchandise have already got a devoted reporting channel for a lot of these content material points.
Out of Scope
[ad_2]