QUICK REVIEW

[Paper Review] Geotechnical Parrot Tales (GPT): Harnessing Large Language Models in geotechnical engineering

Krishna Kumar | arXiv (Cornell University) | Apr 4, 2023

Topic Modeling | Cited by 8

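One-Sentence Summary

The paper examines how large language models can assist geotechnical engineering, reducing hallucinations and improving reliability through prompt engineering, context-aware queries, reasoning prompts, and a proposed action-observation-thought workflow coupled with engineering tools.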

ABSTRACT

The widespread adoption of large language models (LLMs), such as OpenAI's ChatGPT, could revolutionize various industries, including geotechnical engineering. However, GPT models can sometimes generate plausible-sounding but false outputs, leading to hallucinations. In this article, we discuss the importance of prompt engineering in mitigating these risks and harnessing the full potential of GPT for geotechnical applications. We explore the challenges and pitfalls associated with LLMs and highlight the role of context in ensuring accurate and valuable responses. Furthermore, we examine the development of context-specific search engines and the potential of LLMs to become a natural interface for complex tasks, such as data analysis and design. We also develop a unified interface using natural language to handle complex geotechnical engineering tasks and data analysis. By integrating GPT into geotechnical engineering workflows, professionals can streamline their work and develop sustainable and resilient infrastructure systems for the future.

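Motivation and Objectives

Demonstrate the potential and limitations of ChatGPT and other LLMs in geotechnical engineering tasks.
Emphasize the importance of prompt engineering in mitigating hallucination and misalignment.
Propose context-specific, tool-assisted workflows that improve the reliability of engineering analysis.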

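Proposed Method

Explains Transformer-based text generation and how attention, temperature, and probabilistic token selection shape the output.
Describes prompt engineering as the means of steering LLMs toward task-specific results.
Proposes a context-specific search engine built on vector embeddings and FAISS that supplies GPT with relevant DIGGS background information.
Discusses chain-of-thought prompting for reasoning and its limitations in engineering tasks.
Introduces an action-observation-thought (ReAct-style) framework with short-term and long-term memory tools to support complex workflows.
Illustrates a workflow in which GPT calls dedicated tools (such as BearingCapacityTool and SoilReportTool) to perform geotechnical calculations.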
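The context-specific search step can be illustrated with a minimal sketch. It is not the paper's implementation: a bag-of-words vector stands in for a learned embedding, a brute-force cosine scan stands in for a FAISS index, and the three notes in `DOCS` are hypothetical illustrations rather than actual DIGGS schema content. The function names (`embed`, `top_k`, `build_prompt`) are likewise assumptions made for this example.

```python
import math
import re
from collections import Counter

# Hypothetical background notes (NOT real DIGGS schema text).
DOCS = [
    "A borehole record stores the location and total depth of a drilled hole.",
    "A sample record stores the depth interval where soil was collected.",
    "A laboratory test record stores measured soil properties such as water content.",
]

def embed(text):
    """Toy embedding: bag-of-words counts (stand-in for a learned embedding)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def top_k(query, documents, k=2):
    """Brute-force nearest-neighbour search (stand-in for a FAISS index)."""
    q = embed(query)
    scored = [(cosine(q, embed(d)), d) for d in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for score, d in scored[:k] if score > 0]

def build_prompt(query, documents, k=2):
    """Prepend the retrieved context so the model answers from it."""
    context = "\n".join(f"- {d}" for d in top_k(query, documents, k))
    return f"Use only this context:\n{context}\n\nQuestion: {query}"

if __name__ == "__main__":
    print(build_prompt("How is borehole depth stored?", DOCS, k=1))
```

The design point is that the model never sees the whole corpus: only the few passages closest to the query are placed in the prompt, which is what grounds the answer in domain context.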

Figure 1: GPT text generation using transformers.
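The role of temperature in probabilistic token selection can be sketched as follows. This is a generic illustration of temperature-scaled softmax sampling, not code from the paper; the candidate tokens and their logits are made-up values chosen only for the example.

```python
import math
import random

# Hypothetical next-token candidates and scores (illustrative values only).
TOKENS = ["clay", "sand", "silt"]
LOGITS = [2.0, 1.0, 0.5]

def softmax_with_temperature(logits, temperature=1.0):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def sample_token(tokens, logits, temperature=1.0, rng=None):
    """Draw one token according to the temperature-scaled distribution."""
    rng = rng or random.Random()
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(tokens, weights=probs, k=1)[0]

if __name__ == "__main__":
    for t in (0.1, 1.0, 10.0):
        probs = softmax_with_temperature(LOGITS, t)
        print(t, [round(p, 3) for p in probs])
```

A low temperature concentrates probability on the highest-scoring token, giving near-deterministic output; a high temperature flattens the distribution, so less likely tokens are sampled more often.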

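Experimental Results

Research Questions

RQ1: How can prompt engineering mitigate hallucinations and improve the reliability of LLMs in geotechnical engineering tasks?
RQ2: Can context-specific search and a tool-supported framework enable GPT to perform reliable geotechnical analysis?
RQ3: What potential do LLMs have for handling challenging engineering workflows through a reasoning-and-acting framework?
RQ4: How can long-term and short-term memory be integrated to support iterative geotechnical calculations?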

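Key Findings

LLMs are prone to hallucination and alignment problems; they need careful prompt design and contextual grounding.
A context-specific semantic search workflow can supply GPT with precise domain context (for example, DIGGS XML tags) to reduce output errors.
Reasoning prompts (chain-of-thought) can expose the problem-solving steps, but may still produce wrong results without appropriate constraints.
An action-observation-thought (ReAct) approach equipped with engineering tools enables structured workflows and improves computational reliability.
Combining short-term and long-term memory with LLM tool use helps solve complex geotechnical tasks more correctly, as shown in the maximum-load example presented in the paper.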
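The tool-assisted action-observation-thought loop described in these findings can be sketched in a few lines. Everything here is an assumption made for illustration: `scripted_policy` is a hard-coded stand-in for the LLM, the tool signatures are invented (only the names BearingCapacityTool and SoilReportTool come from the paper), the soil values in `MEMORY` are hypothetical, and the bearing-capacity tool uses the general strip-footing equation with Vesic's factors, which may differ from the formula used in the paper.

```python
import math

# Hypothetical long-term memory: stored site data (illustrative values).
MEMORY = {"long_term": {"site_A": {
    "cohesion_kpa": 50.0, "friction_angle_deg": 0.0, "unit_weight_kn_m3": 18.0}}}

def soil_report_tool(_args, memory):
    """Look up soil parameters from long-term memory."""
    return dict(memory["long_term"]["site_A"])

def bearing_capacity_tool(args, _memory):
    """Ultimate capacity of a strip footing: c*Nc + q*Nq + 0.5*gamma*B*Ngamma (kPa)."""
    phi = math.radians(args["friction_angle_deg"])
    n_q = math.exp(math.pi * math.tan(phi)) * math.tan(math.pi / 4 + phi / 2) ** 2
    n_c = math.pi + 2 if phi == 0 else (n_q - 1) / math.tan(phi)
    n_gamma = 2 * (n_q + 1) * math.tan(phi)  # Vesic's expression
    surcharge = args["unit_weight_kn_m3"] * args["depth_m"]
    return (args["cohesion_kpa"] * n_c + surcharge * n_q
            + 0.5 * args["unit_weight_kn_m3"] * args["width_m"] * n_gamma)

TOOLS = {"SoilReportTool": soil_report_tool,
         "BearingCapacityTool": bearing_capacity_tool}

def scripted_policy(observations):
    """Stand-in for the LLM: returns (thought, action, arguments)."""
    if not observations:
        return ("I need the soil parameters first.", "SoilReportTool", {})
    if len(observations) == 1:
        args = {**observations[0], "width_m": 2.0, "depth_m": 1.0}
        return ("Now I can compute the capacity.", "BearingCapacityTool", args)
    return ("I have the answer.", "Finish", {"q_ult_kpa": observations[-1]})

def run_agent(policy, tools, memory, max_steps=5):
    """Alternate thought/action with tool observations until the policy finishes."""
    trace, observations = [], []  # short-term memory for this task
    for _ in range(max_steps):
        thought, action, args = policy(observations)
        if action == "Finish":
            trace.append((thought, action, args))
            return args, trace
        observation = tools[action](args, memory)
        observations.append(observation)
        trace.append((thought, action, observation))
    raise RuntimeError("agent did not finish within max_steps")

if __name__ == "__main__":
    answer, trace = run_agent(scripted_policy, TOOLS, MEMORY)
    print(answer)
```

The arithmetic is delegated to a deterministic tool, so the language model only decides which tool to call and with what inputs. That separation is what the findings credit for the improved reliability of calculations.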

Figure 2: ChatGPT response to write a Python code for soil classification.


This review was generated by AI and reviewed by human editors.