Checkr ditches GPT-4 for a smaller genAI model, streamlines background checks

Fine-tuning the model itself [on Predibase] was very straightforward. Tuning the parameters was easy; it mostly came down to selecting the right model.

Not every model solves the problem with the same accuracy. We fine-tuned with Llama-3, but we're constantly experimenting with other models to improve performance and better fit our training dataset.

Even with small, fine-tuned models, users have reported problems such as hallucinations. How did you overcome those challenges?

Bukhin: Absolutely. It hallucinates frequently. Fortunately, with classification problems there are only 230 possible responses. Strangely enough, it sometimes generates responses that aren't in that set of 230 potential [trained] responses. That makes them easy for me to check and discard before retrying.

That's just simple programmatic logic. It isn't part of the small language model; here we're solving a very specific problem, which is classifying text.
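In code, the guardrail Bukhin describes might look something like the sketch below. The `generate` wrapper and the label set are hypothetical stand-ins (the real system has 230 trained responses), not Checkr's actual code:

```python
from typing import Callable, Optional

# Hypothetical stand-in for the trained label set (Checkr's has ~230 entries).
VALID_LABELS = {"conviction", "dismissed", "pending", "arrest"}

def classify_with_retry(
    record: str,
    generate: Callable[[str], str],  # wraps the fine-tuned model
    max_attempts: int = 3,
) -> Optional[str]:
    """Accept only answers inside the trained label set; otherwise retry."""
    for _ in range(max_attempts):
        answer = generate(record).strip()
        if answer in VALID_LABELS:
            return answer  # in-set response: safe to pass downstream
    return None  # every attempt fell outside the label set; handle separately
```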

But that's not the only approach we take to the overall problem. There's a fallback plan in place… so if certain models don't work, we fall back to deep learning and then to an LLM. There's a lot of logic wrapped around the LLMs, and that logic can act as a safety net. It doesn't rely solely on the model; it includes programmatic logic.
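The fallback chain he outlines could be sketched as a simple cascade. Every stage function here is an assumed placeholder, and the validity check, not the model, is the programmatic safety net:

```python
from typing import Callable, Optional

def classify_with_fallback(
    record: str,
    stages: list[Callable[[str], Optional[str]]],
    is_valid: Callable[[str], bool],
) -> Optional[str]:
    """Try each stage in order, e.g. simple model -> deep learning -> LLM."""
    for stage in stages:
        answer = stage(record)
        # Programmatic safety net: accept only answers that pass validation.
        if answer is not None and is_valid(answer):
            return answer
    return None  # no stage produced a valid answer; escalate (e.g. to a human)
```

A call such as `classify_with_fallback(text, [rules_model, dl_model, llm_model], lambda a: a in VALID_LABELS)` would walk the chain in order, where each stage name is a hypothetical model wrapper.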

So we didn't run into major data-cleaning challenges on this project, though we might in the future, since we generate a lot of unstructured data that hasn't been cleaned yet. Most of the data-cleaning work was already done. Still, we could streamline some of the cleaning processes with LLMs.
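Purely as an illustration of the kind of LLM-assisted cleaning the team says it might adopt, a helper like the one below could normalize one messy free-text field. The prompt and the `llm` callable are assumptions, not anything Checkr has described:

```python
from typing import Callable

def clean_with_llm(raw_text: str, llm: Callable[[str], str]) -> str:
    """Ask an LLM to normalize one unstructured text field."""
    prompt = (
        "Rewrite the following text as a short, consistently formatted "
        "description, fixing spelling and removing boilerplate:\n\n"
        + raw_text
    )
    return llm(prompt).strip()
```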
