Structured Data Extraction Using Large Language Models

Unmet Need: A Method for Long-Form Data Extraction with LLMs

Large Language Models (LLMs) excel at Natural Language Processing (NLP) tasks but struggle to extract information effectively from large databases. Current practices fail to address the inherent limitations of LLMs, resulting in suboptimal performance on long-form data extraction tasks. Researchers at Washington State University (WSU) have developed a multi-step method to overcome these limitations, combining vision-capable LLMs with a sliding window technique for data extraction. Applying and optimizing this technique for long-form data extraction offers a practical solution to a growing challenge in the field and contributes to the evolution of NLP and LLM technologies.

The Technology: Innovative Multi-step Method for Enhanced Long-Form Data Extraction in LLMs

WSU researchers introduced the sliding window method to overcome the limitations of LLMs in long-form data extraction. The method addresses incomplete extractions caused by complicated input data structures, poor instruction following on complex extraction requirements, and small output context limits. It can handle much larger datasets than single-pass extraction methods while balancing cost, speed, and quality.
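To make the sliding window idea concrete, here is a minimal Python sketch. It is not the WSU implementation, whose details this listing does not describe. It only illustrates the general technique: split a long input into overlapping windows, run an extractor on each window, and merge the results without duplicates. The function names, the whitespace tokenization, and the `toy_extract` stub (which stands in for an LLM call) are all assumptions made for illustration. The vision-capable part of the method is not covered.

```python
from typing import Callable, Iterator


def sliding_windows(tokens: list[str], size: int, overlap: int) -> Iterator[list[str]]:
    """Yield windows of up to `size` tokens; neighbours share `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    start = 0
    while start < len(tokens):
        yield tokens[start:start + size]
        if start + size >= len(tokens):
            break  # this window already reached the end of the input
        start += step


def extract_long_form(
    text: str,
    extract: Callable[[str], list[str]],
    size: int = 200,
    overlap: int = 20,
) -> list[str]:
    """Run `extract` on each window and merge results, keeping first-seen order."""
    tokens = text.split()  # assumption: whitespace tokens approximate model tokens
    seen: set[str] = set()
    records: list[str] = []
    for window in sliding_windows(tokens, size, overlap):
        for record in extract(" ".join(window)):
            if record not in seen:  # overlapping windows can repeat a record
                seen.add(record)
                records.append(record)
    return records


def toy_extract(chunk: str) -> list[str]:
    """Hypothetical stand-in for an LLM call: returns tokens that look like IDs."""
    return [word for word in chunk.split() if word.startswith("ID-")]
```

Calling `extract_long_form(document_text, toy_extract, size=200, overlap=20)` processes the document window by window, so no single call has to read or emit the whole input. The overlap reduces the chance that a record straddling a window boundary is missed, at the cost of some repeated work. In practice the window size would be chosen to fit the model's input and output limits.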
Applications:
Overcomes limitations such as the limited output window of LLMs and logical errors in generating extensive text
Facilitates efficient large-scale data mining
Extracts valuable insights from extensive medical records and research papers
Streamlines the analysis of lengthy legal documents
Accelerates the extraction of key findings from vast scientific research data

Advantages:
Enhanced scalability and accuracy
Maximized efficiency
Cost-efficiency
Flexibility and optimized performance

Patent Information:
A provisional patent application has been filed.

Learn More:
Punam Dalai
Technology Licensing Associate
Washington State University
(509) 335-1216
punam.dalai@wsu.edu
Reference No: Software-25/3609

Inventors:
Xiaofeng Guo
Haydn Anderson
Juejing Liu
Noah Waxman

Key Words:
Data mining
Natural Language Processing