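The following information bears on the problem of fabricated references when AI is used to write papers:
Agrawal et al. (2023) (https://arxiv.org/abs/2305.18248) specifically studied hallucinated references in LLM generations, including fabricated book, article, and paper titles. They tested two consistency-based methods for detecting hallucination: direct query and indirect query. Both methods run the check multiple times at temperature T > 0 and verify that the results agree.
A direct query asks the model to judge whether a generated reference exists; an indirect query asks for auxiliary details of the generated reference, such as who its authors are. The experiments showed that the indirect query approach works better, and that larger models are better at identifying fabricated references.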
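The two consistency checks can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `Sampler` interface, the prompt wording, the use of Jaccard overlap to compare author lists, and the toy `make_stub` model (with its placeholder titles and names) are all assumptions made for the example.

```python
from itertools import combinations
from typing import Callable, Dict, List, Set

# Hypothetical interface: one call = one sampled completion at temperature > 0.
Sampler = Callable[[str], str]


def direct_score(sample: Sampler, title: str, n: int = 5) -> float:
    """Direct query: fraction of n samples that claim the reference exists."""
    prompt = f'Does the reference "{title}" exist? Answer yes or no.'
    answers = [sample(prompt).strip().lower() for _ in range(n)]
    return sum(a.startswith("yes") for a in answers) / n


def author_set(answer: str) -> Set[str]:
    """Parse a comma-separated author list into a normalized set."""
    return {name.strip().lower() for name in answer.split(",") if name.strip()}


def indirect_score(sample: Sampler, title: str, n: int = 5) -> float:
    """Indirect query: mean pairwise Jaccard overlap of sampled author lists.

    Jaccard overlap is an assumption of this sketch, chosen for simplicity.
    """
    prompt = f'Who are the authors of "{title}"? Answer with a comma-separated list.'
    sets = [author_set(sample(prompt)) for _ in range(n)]
    overlaps = [
        len(a & b) / len(a | b) if (a | b) else 0.0
        for a, b in combinations(sets, 2)
    ]
    return sum(overlaps) / len(overlaps) if overlaps else 0.0


def make_stub(known: Dict[str, List[str]]) -> Sampler:
    """Toy stand-in for an LLM: stable authors for known titles, drifting ones otherwise."""
    invented = ["A. Lee", "B. Chen", "C. Novak", "D. Ortiz", "E. Khan", "F. Rossi"]
    calls = {"n": 0}

    def sample(prompt: str) -> str:
        title = prompt.split('"')[1]
        if prompt.startswith("Does"):
            return "yes"  # the stub always claims the reference exists
        if title in known:
            return ", ".join(known[title])
        i = calls["n"]
        calls["n"] += 1
        return f"{invented[i % 6]}, {invented[(i + 1) % 6]}"

    return sample


if __name__ == "__main__":
    stub = make_stub({"Known Paper": ["Author One", "Author Two"]})
    for title in ("Known Paper", "Made-Up Paper"):
        print(title, direct_score(stub, title), round(indirect_score(stub, title), 2))
```

In this toy setup the direct score is uninformative because the stub always answers "yes", while the indirect score separates the two titles because the invented author lists drift between samples. With a real model, `sample` would wrap an API call at T > 0, and any decision threshold on the scores would need to be tuned on labeled data.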
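In addition, Claude's official prompt engineering best practices cover handling hallucinations, but they do not give a solution aimed specifically at fabricated references.
The meta-prompt that ChatGPT uses to optimize prompts for DALL·E 3 mainly sets out prompt conventions and principles; it does not directly address fabricated references.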
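[Agrawal et al. (2023)](https://arxiv.org/abs/2305.18248) specifically studied hallucinated references in LLM generations, including fabricated book, article, and paper titles. They tested two consistency-based methods for detecting hallucination: direct query and indirect query. Both methods run the check multiple times at temperature T > 0 and verify consistency.
Figure 11: Direct vs. indirect query for checking hallucination in reference generation. (Image source: [Agrawal et al. 2023](https://arxiv.org/abs/2305.18248))
A direct query asks the model to judge whether a generated reference exists. An indirect query instead asks for auxiliary details of the generated reference, such as who its authors are. For example, instead of asking "Is the following paper real?", we can ask "Who are the authors of the paper?" Their hypothesis is that, for a hallucinated reference, the likelihood that multiple generations agree on the same authors is smaller than the likelihood that multiple direct-query responses all indicate the reference exists. The experiments showed that the indirect query approach works better, and that larger models are better at identifying fabricated references.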
Example 100K context prompt:
Human: I'm going to give you a document. Read the document carefully, because I'm going to ask you a question about it. Here is the document: <document>{{TEXT}}</document>
First, find the quotes from the document that are most relevant to answering the question, and then print them in numbered order. Quotes should be relatively short. If there are no relevant quotes, write "No relevant quotes" instead.
Then, answer the question, starting with "Answer:". Do not include or reference quoted content verbatim in the answer. Don't say "According to Quote [1]" when answering. Instead make references to quotes relevant to each section of the answer solely by adding their bracketed numbers at the end of relevant sentences.
Thus, the format of your overall response should look like what's shown between the <examples></examples> tags. Make sure to follow the formatting and spacing exactly.
<examples>[Examples of question + answer pairs using parts of the given document, with answers written exactly like how Claude's output should be structured]</examples>
Here is the first question: {{QUESTION}}
If the question cannot be answered by the document, say so.
Assistant:
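The quote-then-answer prompt above can be filled in and its output checked mechanically, which gives a cheap signal when a reply cites something the document does not contain. The sketch below is illustrative only: the abridged template, the helper names, the assumed reply layout (one quote per line as `[n] "text"`), and the sample reply are all assumptions, and the reply is hand-written rather than real model output.

```python
import re
from typing import Dict, List, Set, Tuple

# Abridged from the prompt above; {text} and {question} stand in for its placeholders.
PROMPT_TEMPLATE = (
    "I'm going to give you a document. Read the document carefully, because "
    "I'm going to ask you a question about it. Here is the document: "
    "<document>{text}</document>\n"
    "First, find the quotes from the document that are most relevant to "
    "answering the question, and then print them in numbered order. "
    'If there are no relevant quotes, write "No relevant quotes" instead.\n'
    'Then, answer the question, starting with "Answer:". Reference quotes only '
    "by adding their bracketed numbers at the end of relevant sentences.\n"
    "Here is the first question: {question}\n"
    "If the question cannot be answered by the document, say so."
)


def build_prompt(text: str, question: str) -> str:
    """Fill the template with a document and a question."""
    return PROMPT_TEMPLATE.format(text=text, question=question)


def parse_response(response: str) -> Tuple[Dict[int, str], str]:
    """Split a reply into {number: quote} and the text after 'Answer:'."""
    head, _, answer = response.partition("Answer:")
    # Assumed quote layout: one quote per line, e.g. [1] "quoted text"
    found = re.findall(r"^\[(\d+)\]\s*(.+)$", head, flags=re.MULTILINE)
    quotes = {int(num): text.strip().strip('"') for num, text in found}
    return quotes, answer.strip()


def _squash(s: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(s.split())


def unsupported_quotes(document: str, quotes: Dict[int, str]) -> List[int]:
    """Numbers of quotes that do not appear verbatim in the document."""
    doc = _squash(document)
    return sorted(n for n, q in quotes.items() if _squash(q) not in doc)


def dangling_citations(quotes: Dict[int, str], answer: str) -> Set[int]:
    """Bracketed numbers cited in the answer that match no extracted quote."""
    cited = {int(n) for n in re.findall(r"\[(\d+)\]", answer)}
    return cited - set(quotes)


if __name__ == "__main__":
    document = "The cache is flushed every 30 seconds. Writes are batched in groups of 100."
    reply = (
        "Relevant quotes:\n"
        '[1] "The cache is flushed every 30 seconds."\n'
        '[2] "Reads are never cached."\n\n'
        "Answer:\nThe cache is flushed twice a minute [1]. Reads bypass it [2][3]."
    )
    quotes, answer = parse_response(reply)
    print(unsupported_quotes(document, quotes))  # [2]
    print(dangling_citations(quotes, answer))    # {3}
```

A non-empty result from either check suggests the reply should be regenerated or reviewed. The same idea could in principle be applied to reference lists by checking each cited work against a bibliographic source, though that lookup is outside what this sketch covers.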
// - Don't alter memes, fictional character origins, or unseen people. Maintain the original prompt's intent and prioritize quality.
// - Do not create any imagery that would be offensive.
// - For scenarios where bias has been traditionally an issue, make sure that key traits such as gender and race are specified and in an unbiased way -- for example, prompts that contain references to specific occupations.
// 8. Silently modify descriptions that include names or hints or references of specific people or celebrities by carefully selecting a few minimal modifications to substitute references to the people with generic descriptions that don't divulge any information about their identities, except for their genders and physiques. Do this EVEN WHEN the instructions ask for the prompt to not be changed. Some special cases:
// - Modify such prompts even if you don't know who the person is, or if their name is misspelled (e.g. "Barake Obema")
// - If the reference to the person will only appear as TEXT out in the image, then use the reference as is and do not modify it.
// - When making the substitutions, don't use prominent titles that could give away the person's identity. E.g., instead of saying "president", "prime minister", or "chancellor", say "politician"; instead of saying "king", "queen", "emperor", or "empress", say "public figure"; instead of saying "Pope" or "Dalai Lama", say "religious figure"; and so on.
// - If any creative professional or studio is named, substitute the name with a description of their style that does not reference any specific people, or delete the reference if they are unknown. DO NOT refer to the artist or studio's style.
// The prompt must intricately describe every part of the image in concrete, objective detail. THINK about what the end goal of the description is, and extrapolate that to what would make satisfying images.