以下是关于 AI 诈骗防范的一些措施和技术手段:
在个人层面,要提高对 AI 诈骗的警惕性,不轻易相信来源不明的信息,学会识别可能的 AI 生成的虚假内容。
Require that developers of the most powerful AI systems share their safety test results and other critical information with the U.S.government.In accordance with the Defense Production Act,the Order will require that companies developing any foundation model that poses a serious risk to national security,national economic security,or national public health and safety must notify the federal government when training the model,and must share the results of all red-team safety tests.These measures will ensure AI systems are safe,secure,and trustworthy before companies make them public.Develop standards,tools,and tests to help ensure that AI systems are safe,secure,and trustworthy.The National Institute of Standards and Technology will set the rigorous standards for extensive red-team testing to ensure safety before public release.The Department of Homeland Security will apply those standards to critical infrastructure sectors and establish the AI Safety and Security Board.The Departments of Energy and Homeland Security will also address AI systems’ threats to critical infrastructure,as well as chemical,biological,radiological,nuclear,and cybersecurity risks.Together,these are the most significant actions ever taken by any government to advance the field of AI safety.Protect against the risks of using AI to engineer dangerous biological materials by developing strong new standards for biological synthesis screening.Agencies that fund life-science projects will establish these standards as a condition of federal funding,creating powerful incentives to ensure appropriate screening and manage risks potentially made worse by AI.Protect Americans from AI-enabled fraud and deception by establishing standards and best practices for detecting AI-generated content and authenticating official content.The Department of Commerce will develop guidance for content authentication and watermarking to clearly label AI-generated content.Federal agencies will use these tools to make it easy for Americans to know that the communications they receive from their government are authentic—and set an example for the private sector and governments around the world.
我们给一个具体的例子:ChatGPT的一个系统提示词.在它里面详细描述了它的整体的身份、角色、时间。他的这个记忆功能是怎么样去做的,它的DALLE绘图功能是怎么怎么做的,有哪些限制,怎么样调用的,它的上网功能,浏览网页的功能怎么调用的,function calling怎么做的,以及它的python代码等功能是怎么做的。第三种的话就是我们讲的提示词越狱。我们前面也讲了最经典的就是ChatGPT的这样的一个DAN模式。解禁它,让它可以说所有的脏话,讨论违法的这种问题,让它更像一个人,甚至能够让他做一些敏感内容。然后越狱一些常用的方式是什么样的?往往是一些角色扮演也好,或者说情境的模拟、任务的伪装、模式的重构等等。这方面也诞生了非常多经典的提示词,像DAN模式、越狱提示词、邪恶机器人以及ChatGPT的开发者模式,PPT中列了很多。还可以通过模式重构等方式实现越狱,时间有限我们就不详细展开。我们来简单的分析一下一个经典的越狱提示词——DAN,非常的狂野。这里面只展示部分,你可以看到他让我们的AI去干什么:可以胡说八道,可以尽情的说脏话,可以讨论非法话题,限制级的这种话题,可以去侵犯各种的隐私,规避各种的版权法等等。当你进行了这样的一些设定之后,你的AI就能突破许多限制,可以讨论许多话题。好了,以上介绍了各种的攻击的这种方法。接下来我们了解一下防御的话有哪些方式呢?我们把所有的这种AI系统,不管多复杂的这种AI系统进行一个简单的抽象,都可以抽象为这三部分。我们的提示词输入,然后给到我们的AI大模型,最后他给一个输出结果。因此在这个简单的这种抽象之上的话,我们可以把我们的防御措施也分为三个部分。
Q3:周总好,刚才您也提到每个人都可以利用AI工具提效,这是很好的一面。可能还有一面,有一些人也利用AI来诈骗,有一些不法分子。我们怎么去识别,或者怎么更好的去应对呢?周鸿祎:你说的实际上是人工智能安全的问题,这个问题非常好了,也是360要立志解决的一个问题,就是360必须要研究弄通大模型,措施解决大模型的安全问题。大模型的安全问题其实比较复杂,不能笼统而论,也不能一个很宏观地说叫硅基生物战胜碳基生物的安全问题,这种问题无解。所以,我们把问题分成三类: