Checkr ditches GPT-4 for a smaller genAI model, streamlines background checks

Fine-tuning the model itself [on Predibase] was very straightforward. Tuning the parameters was easy; it mostly came down to selecting the right model.

Not every model solves the problem with the same accuracy. We fine-tuned with Llama-3, but we're constantly experimenting with other models to improve performance and better fit our training dataset.

Even with small, fine-tuned models, users have reported problems such as hallucinations. How did you overcome those challenges?

Bukhin: Absolutely. It hallucinates frequently. Fortunately, with classification problems there are only 230 possible responses. Strangely enough, it sometimes generates responses that aren't in that set of 230 potential [trained] responses. That makes them easy for me to check and discard before retrying.

That's just simple programmatic logic. It isn't part of the small language model; here we're solving a very specific problem, which is classifying text.
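In code, the guardrail Bukhin describes might look something like the sketch below. The `generate` wrapper and the label set are hypothetical stand-ins (the real system has 230 trained responses), not Checkr's actual code:

```python
from typing import Callable, Optional

# Hypothetical stand-in for the trained label set (Checkr's has ~230 entries).
VALID_LABELS = {"conviction", "dismissed", "pending", "arrest"}

def classify_with_retry(
    record: str,
    generate: Callable[[str], str],  # wraps the fine-tuned model
    max_attempts: int = 3,
) -> Optional[str]:
    """Accept only answers inside the trained label set; otherwise retry."""
    for _ in range(max_attempts):
        answer = generate(record).strip()
        if answer in VALID_LABELS:
            return answer  # in-set response: safe to pass downstream
    return None  # every attempt fell outside the label set; handle separately
```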

But that's not the only approach we take to the overall problem. There's a fallback plan in place… so if certain models don't work, we fall back to deep learning and then to an LLM. There's a lot of logic wrapped around the LLMs, and that logic can act as a safety net. It doesn't rely solely on the model; it includes programmatic logic.
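The fallback chain he outlines could be sketched as a simple cascade. Every stage function here is an assumed placeholder, and the validity check, not the model, is the programmatic safety net:

```python
from typing import Callable, Optional

def classify_with_fallback(
    record: str,
    stages: list[Callable[[str], Optional[str]]],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Try each stage in order, e.g. simple model -> deep learning -> LLM."""
    for stage in stages:
        answer = stage(record)
        # Programmatic safety net: accept only answers that pass validation.
        if answer is not None and is_valid(answer):
            return answer
    return None  # no stage produced a valid answer; escalate (e.g. to a human)
```

A call such as `classify_with_fallback(text, [rules_model, dl_model, llm_model], lambda a: a in VALID_LABELS)` would walk the chain in order, where each stage name is a hypothetical model wrapper.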

So we didn't run into major data-cleaning challenges on this project, though we might in the future, since we generate a lot of unstructured data that hasn't been cleaned yet. Most of the data-cleaning work was already done. Still, we could streamline some of the cleaning processes with LLMs.
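Purely as an illustration of the kind of LLM-assisted cleaning the team says it might adopt, a helper like the one below could normalize one messy free-text field. The prompt and the `llm` callable are assumptions, not anything Checkr has described:

```python
from typing import Callable

def clean_with_llm(raw_text: str, llm: Callable[[str], str]) -> str:
    """Ask an LLM to normalize one unstructured text field."""
    prompt = (
        "Rewrite the following text as a short, consistently formatted "
        "description, fixing spelling and removing boilerplate:\n\n"
        + raw_text
    )
    return llm(prompt).strip()
```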
