TextRefine: A Novel approach to improve the accuracy of LLM Models

Ekta Dalal; Parvinder Singh

doi:10.56294/dm2024331

Original

Published: 2024-05-20

Abstract

Natural Language Processing (NLP) is an interdisciplinary field that aims to build computational models and algorithms that can comprehend, produce, and analyze natural language in a way similar to humans. Despite their outstanding performance on NLP tasks, large language models (LLMs) still struggle with noisy and unpolished input text. To address this problem, TextRefine offers a thorough preprocessing pipeline that cleans and refines text data before it is passed to an LLM. The pipeline includes a number of steps: removing social tags, normalizing whitespace, converting lowercase letters to uppercase, removing stopwords, fixing Unicode issues, unpacking contractions, removing punctuation and accents, and general text cleanup. Together, these steps strengthen the integrity and quality of the input data, which in turn improves the efficiency and precision of LLMs. Extensive testing and comparisons with standard techniques show TextRefine's effectiveness, with a reported accuracy of 99%.
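
The steps listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `refine`, the small `STOPWORDS` and `CONTRACTIONS` tables, the ordering of the steps, and the reading of "social tags" as @mentions and #hashtags are all assumptions made for illustration, using only the Python standard library.

```python
import re
import unicodedata

# Tiny illustrative tables (assumptions; a real pipeline would use fuller lists).
STOPWORDS = {"a", "an", "the", "is", "are", "to", "of", "and", "in"}
CONTRACTIONS = {
    "can't": "cannot",
    "won't": "will not",
    "don't": "do not",
    "it's": "it is",
}


def refine(text: str) -> str:
    """Hypothetical cleanup function covering the steps named in the abstract."""
    # Fix Unicode issues: fold compatibility characters, straighten curly apostrophes.
    text = unicodedata.normalize("NFKC", text).replace("\u2019", "'")

    # Remove social tags (assumed here to mean @mentions and #hashtags).
    text = re.sub(r"[@#]\w+", " ", text)

    # Unpack contractions using the lookup table above.
    for short, full in CONTRACTIONS.items():
        text = re.sub(r"\b" + re.escape(short) + r"\b", full, text,
                      flags=re.IGNORECASE)

    # Remove accents: decompose characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in decomposed if not unicodedata.combining(c))

    # Remove punctuation (anything that is neither a word character nor whitespace).
    text = re.sub(r"[^\w\s]", " ", text)

    # Remove stopwords; split() plus join() also normalizes whitespace.
    tokens = [t for t in text.split() if t.lower() not in STOPWORDS]

    # Case conversion: lowercase to uppercase, as the abstract states.
    return " ".join(tokens).upper()


if __name__ == "__main__":
    print(refine("@user I can't believe the café   is #closed!!"))
```

Contractions are unpacked before punctuation is stripped in this sketch, since removing apostrophes first would leave the contraction table with nothing to match. The abstract does not specify an ordering, so this choice is an assumption.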

Keywords:

TextRefine; Natural Language Processing; LLM Models

How to Cite

Dalal E, Singh P. TextRefine: A Novel approach to improve the accuracy of LLM Models. Data and Metadata [Internet]. 2024 May 20 [cited 2024 Jul. 27];3:331. Available from: https://dm.saludcyt.ar/index.php/dm/article/view/331

Copyright Notice

The article is distributed under the Creative Commons Attribution 4.0 License. Unless otherwise stated, associated published material is distributed under the same licence.

Vol. 3 (2024)