Since a chatbot's responses are based on the training data in its Large Language Model (LLM), the material included in that data is crucial to how the chatbot functions. Many AI companies have simply scraped the entire internet for material to include in their training data. When that is the case, a great deal of unwanted material is swept in along the way, and social media posts and information from non-credible sources end up reflected in the chatbot's generated responses.
LLMs are based on probabilities, which means that the most common opinions and perspectives will be the ones highlighted in the generated content. As users of AI tools, we need to consider which perspectives are likely to dominate the training data and which perspectives and voices are missing from it. It is difficult to get nuanced descriptions or analyses of social phenomena when the most marginalized perspectives are not represented.
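To see why probability-based generation favours the majority view, consider the toy sketch below, written in Python purely as an illustration: the sentences and counts are invented, and this is not how any real chatbot is built. Continuations that are common in the training data get high probabilities and are sampled again and again, while a rare perspective almost never appears in the output.

```python
import random
from collections import Counter

# Invented counts standing in for how often each continuation of the phrase
# "remote work is ..." appears in a fictional training corpus.
continuation_counts = Counter({
    "great for productivity": 700,        # common, well-represented view
    "bad for team spirit": 250,           # less common view
    "impossible without childcare": 50,   # marginalized perspective
})

# Turn raw counts into probabilities: this is the "based on probabilities" part.
total = sum(continuation_counts.values())
probabilities = {text: count / total for text, count in continuation_counts.items()}
print(probabilities)

# Generate 1000 "responses" by sampling from that distribution.
samples = random.choices(list(probabilities), weights=list(probabilities.values()), k=1000)
print(Counter(samples))
```

Run repeatedly, the majority phrasing comes back roughly 70 percent of the time, while the rarest perspective appears only a handful of times per thousand samples. That imbalance is the mechanism behind the skew described above.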
It is also possible for the private companies that build chatbots to filter or shape responses according to what they consider undesirable, without making this clear to users. This lack of transparency raises concerns about bias, censorship, and who gets to control access to information.
AI companies are often not transparent about where or how they obtained the material that makes up the training data for their language models. Some companies appear to have scraped the internet for texts without the copyright holders' permission. When the training data contains copyrighted material, the generated responses are influenced by it, which raises concerns about intellectual property.
It’s not always clear whether your prompts or the material you upload to chatbots are used as training data for the AI companies’ language models. In some chatbots, it’s possible to change the settings and disable this feature, but some require you to use the paid version to turn it off. Because of this, you should not upload material that contains sensitive or private information, or any material protected by copyright law, for example PDFs of scientific articles that you’ve downloaded from library databases. If you have text intended for a thesis, dissertation, or scientific publication, you should not submit it to a chatbot, as doing so may mean relinquishing your copyright to that material.
Another issue is whether the "prompter" or the AI company owns the response a chatbot generates. To fully grasp this issue, it is important to be aware that copyright law differs between countries, which becomes problematic since most AI companies are based in the US but their tools are used worldwide. The question of copyright and AI is very complex, and there are not yet enough legal cases to determine with certainty what type of use is acceptable in which situations. There have been some major lawsuits: for example, Disney and Universal sued the image generation tool Midjourney in 2025, and a group of authors including George R.R. Martin sued OpenAI in 2023. It is therefore very important to think carefully before using chatbots, because it is not possible to withdraw private, sensitive or copyrighted material afterwards.
An important piece of legislation within the EU is the AI Act, which is designed to promote trustworthy AI and regulates the development, provision and use of AI. The regulation sorts AI systems into different levels according to the risk they pose: unacceptable risk, high risk, limited risk and minimal risk.

Figure 6. Risk levels (Internetstiftelsen, 2024, 13 June). Translated.
Running an AI tool requires a lot of energy because of the computing power AI demands, as well as water to cool the data centers. AI companies are not fully transparent about their energy use, but both energy and water consumption have clearly increased enormously in recent years alongside the success of chatbots.
It is difficult to determine exactly how much energy and water is required to run AI tools; it depends a lot on which tool you use and the nature of your prompt. However, given how many people use AI tools and how heavily they rely on them, the strain on energy resources is enormous. It is therefore important to be aware of why you are using AI, how you are using it and whether it is necessary for the task at hand. Perhaps a simple web search would do just as well?
Language models are, as the name suggests, good at imitating language. When you interact with a chatbot, the answers you receive are basically indistinguishable from something a human could have written. It may feel like you're communicating with another person, but when you break down the technology, you see that it's all based on statistics and probabilities rather than consciousness or some magical intelligence in the sky.
Generative AI is based on complex statistical models and is built with advanced deep learning techniques inspired by the neural networks of the human brain. The algorithms are designed to take context and circumstances into account, which helps generate a more relevant response to your prompt. What generative AI cannot do is "understand" your prompt in the way a human can, even though it can be very good at generating relevant responses.
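The sketch below shows, in miniature, what "statistics and probabilities" means in practice. The words and scores are made up, and a real model computes its scores with a neural network over tens of thousands of tokens, but the generation step itself works the same way: score every possible next word, turn the scores into probabilities, sample one word, and repeat. Nothing in that loop involves understanding the prompt.

```python
import math
import random

# Made-up scores ("logits") that a model might assign to possible next words
# after the context "The cat sat on the". Real models compute these with a
# neural network; here they are hard-coded for illustration.
next_word_scores = {"mat": 4.0, "sofa": 2.5, "roof": 1.0, "equation": -2.0}

def softmax(scores):
    """Convert raw scores into a probability distribution that sums to 1."""
    exps = {word: math.exp(s) for word, s in scores.items()}
    total = sum(exps.values())
    return {word: e / total for word, e in exps.items()}

probs = softmax(next_word_scores)
print(probs)  # "mat" gets by far the highest probability

# Generation is just repeated sampling from distributions like this one.
next_word = random.choices(list(probs), weights=list(probs.values()), k=1)[0]
print("The cat sat on the", next_word)
```

The only thing that changes in a real chatbot is the scale: the scores come from a network with billions of parameters and the context can be thousands of words long, but each word in the answer is still drawn from a probability distribution, one word at a time.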
There is a lot of discussion about what intelligence is and how it should be defined. Some definitions of intelligence can be stretched to include chatbots, and occasionally even definitions of consciousness. There are different philosophical perspectives on this, but as a user of AI tools it is simply good to know how the technology works so as not to have inflated expectations of the results. A metaphor often used to describe generative AI is the stochastic parrot: like parrots, chatbots can imitate human speech and expression but cannot understand what is being said.
This is what ChatGPT (GPT-4o) generated on the meaning of stochastic parrots (2025-07-25):
When large language models and chatbots are described as "stochastic parrots," it means that they:
In short, the term emphasizes that these systems, despite their impressive language abilities, are statistical machines — “parrots” that combine and reuse what they’ve seen, without true insight or intent.