
Technological Advancements and LLMs
Saikat Mahanty

The history of Artificial Intelligence (AI) is deeply rooted in the human ambition to replicate and enhance cognitive functions using machines. The concept of intelligent machines dates back to ancient mythology,[1] but the formal foundation of AI as a scientific field was laid in the mid-20th century. In 1956, the term “Artificial Intelligence” was officially coined by John McCarthy during the Dartmouth Conference,[2] where researchers gathered to explore ways to make machines “think” like humans. Early AI efforts were fueled by symbolic AI — also known as “good old-fashioned AI” (GOFAI)[3] — which relied on rule-based systems and logical inference to solve problems.

During the 1960s and 70s, AI research focused on creating algorithms for simple tasks like playing chess, solving mathematical problems, and simulating basic conversations (as seen in ELIZA, an early chatbot). These systems, however, struggled with real-world complexity due to their inability to handle ambiguity, context, and vast amounts of data. The limitations of symbolic AI led to the first “AI winter” — a period of reduced funding and interest — as progress stagnated.

The AI landscape shifted dramatically in the 1980s and 90s with the advent of machine learning (ML). Unlike symbolic AI, ML systems did not rely on explicitly programmed rules; instead, they learned patterns from data. This shift was powered by statistical methods, neural networks, and advancements in computing power. Neural networks, inspired by the human brain, were first proposed in the 1940s but gained traction with the backpropagation algorithm in the 1980s. Despite their potential, early neural networks were limited by computational constraints. It wasn’t until the 2000s, with the rise of big data and more powerful GPUs, that deep learning — a subset of ML using multi-layered neural networks — took center stage.

The breakthrough moment came in 2012 with the AlexNet model, which achieved unprecedented accuracy in the ImageNet competition. This success reignited AI research, shifting focus from rule-based systems to data-driven models.

Language modeling — predicting the probability of word sequences — has long been a core pursuit in AI. Early language models like n-grams worked by estimating word likelihoods based on fixed-size windows of previous words. However, these models struggled with long-range dependencies and contextual understanding.
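
As a concrete sketch of the n-gram idea, the following Python snippet builds a bigram model over a toy corpus and estimates next-word probabilities from raw counts. The corpus and function names are illustrative only, not drawn from any particular system.

```python
from collections import Counter, defaultdict

# A minimal bigram language model: estimate P(next_word | previous_word)
# by counting adjacent word pairs in a toy corpus.
corpus = "the cat sat on the mat the cat slept".split()

bigram_counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram_counts[prev][nxt] += 1

def next_word_prob(prev: str, nxt: str) -> float:
    """P(nxt | prev) by maximum-likelihood estimation over the corpus."""
    total = sum(bigram_counts[prev].values())
    return bigram_counts[prev][nxt] / total if total else 0.0

print(next_word_prob("the", "cat"))  # 2/3: "the" precedes "cat" twice, "mat" once
```

Because the window is fixed at one preceding word, the model has no way to use context further back, which is exactly the long-range-dependency weakness described above.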

Word Embeddings (Word2Vec, 2013): Introduced by Google, Word2Vec represented words as dense vectors in a continuous space, capturing semantic relationships (e.g., “king” – “man” + “woman” ≈ “queen”). This marked a shift from symbolic representations to distributed representations, enabling richer understanding of word relationships.
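
The vector arithmetic behind that analogy can be sketched with hand-picked toy vectors; real Word2Vec embeddings are learned from large corpora and have hundreds of dimensions, so the numbers below are purely illustrative.

```python
import numpy as np

# Toy 3-dimensional "embeddings" constructed by hand so the famous
# analogy holds; real Word2Vec vectors are learned, not hand-written.
emb = {
    "king":  np.array([0.9, 0.8, 0.1]),
    "man":   np.array([0.5, 0.1, 0.1]),
    "woman": np.array([0.5, 0.1, 0.9]),
    "queen": np.array([0.9, 0.8, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# king - man + woman should land nearest to queen.
target = emb["king"] - emb["man"] + emb["woman"]
best = max(emb, key=lambda w: cosine(emb[w], target))
print(best)  # queen
```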

Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM): RNNs were designed to process sequences, making them suitable for language modeling. LSTMs, introduced in 1997, addressed the “vanishing gradient problem” by enabling models to remember long-term dependencies — a crucial step for understanding language context.
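
A minimal sketch of how an LSTM consumes a sequence is shown below, using PyTorch’s nn.LSTM (assuming PyTorch is available); the sizes are arbitrary and the weights untrained, so this only illustrates the shape of the computation, not a trained language model.

```python
import torch
import torch.nn as nn

# An LSTM reads one token embedding at a time and carries a hidden state
# (and cell state) forward, letting context persist across time steps.
lstm = nn.LSTM(input_size=16, hidden_size=32, batch_first=True)

tokens = torch.randn(1, 10, 16)        # batch of 1 sentence, 10 tokens, dim 16
outputs, (h_n, c_n) = lstm(tokens)
print(outputs.shape)                   # torch.Size([1, 10, 32]): one context vector per token
```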

[4]Attention Mechanism and the Transformer Architecture (2017): The most significant breakthrough came with the paper Attention is All You Need by Vaswani et al. in 2017. The transformer model replaced sequential processing with a self-attention mechanism, allowing it to weigh the importance of different words in a sentence simultaneously. This parallelization dramatically improved efficiency and scalability, laying the foundation for modern LLMs.
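
The core computation, scaled dot-product self-attention, is compact enough to sketch directly. Below, Q, K, and V are random stand-ins for the learned query, key, and value projections of a four-token sentence; the point is the parallel, all-pairs weighting the paragraph describes.

```python
import numpy as np

# Scaled dot-product self-attention (Vaswani et al., 2017), sketched
# with random matrices in place of learned projections.
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))   # queries for 4 tokens, dimension 8
K = rng.normal(size=(4, 8))   # keys
V = rng.normal(size=(4, 8))   # values

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Every token attends to every other token simultaneously; each row of
# `weights` says how much each word matters when re-encoding one position.
weights = softmax(Q @ K.T / np.sqrt(K.shape[-1]))
attended = weights @ V
print(weights.shape, attended.shape)  # (4, 4) (4, 8)
```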

BERT, developed by Google, revolutionized natural language processing (NLP) by introducing bidirectional context — reading text both left-to-right and right-to-left simultaneously. This allowed BERT to better capture nuance, ambiguity, and word relationships. BERT’s “pre-train then fine-tune” approach became the gold standard, as it enabled models to learn general language representations from massive text corpora and adapt to specific tasks with minimal data.
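
BERT’s bidirectional masked-word prediction can be tried directly with the Hugging Face transformers library (an assumption of this sketch; the pretrained weights download on first use). The model uses context on both sides of the [MASK] token to rank candidate words.

```python
from transformers import pipeline

# Fill-mask with a pretrained BERT: the model predicts the hidden word
# from the surrounding context on both sides.
fill_mask = pipeline("fill-mask", model="bert-base-uncased")

for guess in fill_mask("The court held that the [MASK] was invalid.")[:3]:
    print(guess["token_str"], round(guess["score"], 3))
```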

OpenAI’s GPT models took a different path, focusing on generative capabilities.

GPT-1 (2018)[5]: With 117 million parameters, GPT-1 showed that language models could generate coherent text by predicting the next word based on preceding words.

GPT-2 (2019)[6]: A leap forward, GPT-2 had 1.5 billion parameters and demonstrated impressive text generation, but was initially withheld due to concerns about misuse (e.g., generating fake news).

GPT-3 (2020)[7]: Boasting 175 billion parameters, GPT-3 pushed the boundaries of few-shot learning — the ability to perform tasks with minimal examples — and exhibited striking fluency, sparking mainstream interest and ethical debates.

GPT-4 (2023): With even more advanced capabilities, GPT-4 improved multimodal learning (understanding both text and images), reinforcing the need for regulatory frameworks as LLMs grew more sophisticated.

Google’s T5 unified NLP tasks by reframing all problems as text-to-text transformations — for example, translation, summarization, and question-answering were treated as generating output text from input text. This simplified task formulation and improved generalization.

Google’s PaLM built on scaling laws (empirical findings on how model performance improves with size, data, and compute), pushing LLMs towards trillion-parameter-scale models.
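
T5’s text-to-text framing can be seen in practice with the Hugging Face transformers library (again an assumption of this sketch; “t5-small” is a small public checkpoint trained with these task prefixes). The task is named in a plain-text prefix, and the answer comes back as generated text.

```python
from transformers import pipeline

# One model, many tasks: the task is specified in the input text itself.
t5 = pipeline("text2text-generation", model="t5-small")

print(t5("translate English to German: The house is wonderful.")[0]["generated_text"])
print(t5("summarize: Language models learn general skills from large corpora "
         "and are then adapted to specific tasks.")[0]["generated_text"])
```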

As LLMs grew in power, their societal impact became a major concern. These models, trained on vast amounts of internet data, often inherited biases and misinformation present in their training sets.

Bias and Fairness: Studies have shown that LLMs can perpetuate racial, gender, and cultural biases. For example, biased associations (linking certain professions to particular genders) can influence AI-driven hiring systems or content moderation tools.

Misinformation and Deepfakes: Generative models, capable of creating hyper-realistic text, images, and videos, have raised alarms about AI-generated propaganda, impersonation, and fake news.

Intellectual Property (IP) Issues: There is ongoing debate about whether training on copyrighted content without consent violates intellectual property law, a question central to any inquiry into how human expression and LLMs can coexist.

Regulatory Response: Governments and organizations have started drafting AI regulations. The European Union’s AI Act categorizes AI systems by risk level, while the U.S. has put forward AI Bill of Rights principles emphasizing transparency, accountability, and human oversight.

The history of AI and LLMs is one of relentless progress, from symbolic AI’s early struggles to today’s transformer-based behemoths. With the rise of models like GPT-4, PaLM, and BERT, the line between human and machine-generated content has blurred, underscoring the urgent need for legal frameworks. As these models continue to evolve, balancing technological innovation with the preservation of human expression and ethical AI governance becomes paramount. Crafting a legal structure for LLMs is therefore not just timely but essential for shaping AI’s future in a way that centers human creativity and autonomy.

[1]https://thereader.mitpress.mit.edu/the-ancient-history-of-intelligent-machines/

[2]https://home.dartmouth.edu/about/artificial-intelligence-ai-coined-dartmouth

[3]https://www.cambridge.org/core/books/abs/cambridge-handbook-of-artificial-intelligence/gofai/FCF7D6DD921658FE8AE9F2A2B0FECBDD

[4]https://www.cloudthat.com/resources/blog/attention-mechanisms-in-transformers

[5]https://www.ibm.com/think/topics/gpt

[6]https://openai.com/index/better-language-models/

[7]https://www.sciencedirect.com/topics/computer-science/generative-pre-trained-transformer-3