Alright, let’s dive into this “why are the sharks so bad” thing. So, I got this idea, right? I wanted to mess around with some AI stuff, specifically using it to figure out why some shark stuff just… sucks. Why sharks? Because they’re cool, and hey, something had to be the guinea pig.

Phase 1: Data Gathering (The Messy Part)
- First, I scraped a bunch of reviews. I figured user reviews of shark movies, toys, video games, anything shark-related, would be a good place to start. Used some basic Python with Beautiful Soup to pull text from various websites. Honestly, this took forever. So many sites are a pain to scrape.
- Then, cleaned up the data. Got rid of HTML tags, weird characters, all that jazz. Used regular expressions for this. It’s like, 80% cleaning, 20% actual fun stuff.
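
A minimal sketch of what that regex cleanup could look like, using only the standard library. The function name `clean_review` and the exact patterns are assumptions for illustration, not the script actually used here, and regex tag-stripping is only good enough for rough review text, not arbitrary HTML.

```python
import html
import re

TAG_RE = re.compile(r"<[^>]+>")          # anything that looks like an HTML tag
JUNK_RE = re.compile(r"[^\w\s.,!?'-]")   # keep word characters and basic punctuation (assumed choice)
SPACE_RE = re.compile(r"\s+")            # runs of whitespace, including non-breaking spaces

def clean_review(raw: str) -> str:
    """Strip tags, decode HTML entities, drop odd characters, collapse whitespace."""
    text = TAG_RE.sub(" ", raw)      # tags first, so decoded "&lt;" is not mistaken for a tag
    text = html.unescape(text)       # "&nbsp;" and friends become real characters
    text = JUNK_RE.sub(" ", text)    # star ratings, emoji and similar debris
    return SPACE_RE.sub(" ", text).strip()

print(clean_review("<p>Worst&nbsp;shark movie ever!!! ★☆☆</p>"))
```

Replacing tags with a space rather than nothing keeps words from neighboring elements from getting glued together.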
Phase 2: Sentiment Analysis (Let’s See the Hate)
- I decided to use a pre-trained sentiment analysis model from Hugging Face. Easiest way to get a quick idea of what people think. Loaded up the model, fed it all the reviews, and boom, got back sentiment scores for each one.
- Visualized the sentiment distribution. A lot more negative reviews than I expected! People are really disappointed with their shark experiences, apparently.
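
Roughly, the scoring and the numbers behind the distribution plot could be sketched like this. It assumes the default checkpoint behind the Transformers `sentiment-analysis` pipeline, which may not be the model actually used here, and the helper names and example results are made up for illustration.

```python
from collections import Counter

def score_reviews(reviews):
    """Run a pre-trained sentiment model over a list of review strings."""
    # Imported lazily so the counting helper below works without transformers installed.
    from transformers import pipeline
    classifier = pipeline("sentiment-analysis")   # default checkpoint: an assumption
    # Returns a list of dicts shaped like {"label": "NEGATIVE", "score": 0.98}.
    return classifier(reviews, truncation=True)

def sentiment_distribution(results):
    """Return the share of reviews per label."""
    counts = Counter(r["label"] for r in results)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()} if total else {}

# Hand-written results in the pipeline's output shape, for illustration only.
example = [
    {"label": "NEGATIVE", "score": 0.99},
    {"label": "NEGATIVE", "score": 0.87},
    {"label": "POSITIVE", "score": 0.95},
    {"label": "NEGATIVE", "score": 0.91},
]
print(sentiment_distribution(example))
```

The resulting dictionary can go straight into a bar chart with whatever plotting library is handy.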
Phase 3: Topic Modeling (Digging Deeper)
- Next, I wanted to know why the sharks were so bad. So, topic modeling! I used LDA (Latent Dirichlet Allocation). It’s a way to automatically discover the main topics discussed in a bunch of text.
- Preprocessed the reviews again. This time, I lemmatized the words (turned “running” into “run”), removed stop words (like “the”, “a”, “is”), and created a document-term matrix.
- Ran the LDA model. Messed around with the number of topics until I got something that made sense. Turns out, people complain about bad CGI, stupid plots, and unrealistic shark behavior. Shocking, I know.
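
Here is a rough sketch of the stop-word removal, document-term matrix, and LDA steps. It swaps in scikit-learn's `LatentDirichletAllocation` as an assumption (the library actually used isn't named above), the stop-word list is a tiny made-up one, and lemmatization is left out because it needs a library such as NLTK or spaCy.

```python
from collections import Counter

STOP_WORDS = {"the", "a", "is", "and", "of", "was", "it", "to"}  # tiny illustrative list

def tokenize(doc):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,!?\"'").lower() for w in doc.split())
    return [w for w in words if w and w not in STOP_WORDS]

def build_doc_term_matrix(docs):
    """Return (sorted vocabulary, one row of word counts per document)."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[w] for w in vocab] for c in counts]
    return vocab, matrix

def fit_lda(matrix, n_topics):
    """Fit LDA on the count matrix and return the topic-word weight matrix."""
    from sklearn.decomposition import LatentDirichletAllocation  # lazy import
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(matrix)
    return lda.components_   # shape: (n_topics, len(vocab))

docs = ["The CGI was cheap.", "The plot is dumb and the CGI is cheap!"]
vocab, matrix = build_doc_term_matrix(docs)
print(vocab)
print(matrix)
```

"Messing around with the number of topics" then just means calling `fit_lda` with a few values of `n_topics` and reading the top words of each topic to see which setting gives the most sensible groupings.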
Phase 4: Word Clouds (Because Why Not?)
- Generated word clouds for each topic. Just a fun way to visualize the most frequent words. The “bad CGI” topic had words like “effects,” “graphics,” and “cheap.” The “stupid plot” topic had words like “plot,” “story,” “dumb,” and “idiotic.” You get the idea.
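
A per-topic word cloud could be sketched like this, assuming the `wordcloud` package and a row of topic-word weights from the topic model. The vocabulary and weights below are invented for illustration, and with LDA the "frequencies" are really topic weights rather than raw counts.

```python
def top_words(topic_weights, vocab, n=10):
    """Map the n heaviest words in one topic to their weights."""
    ranked = sorted(zip(vocab, topic_weights), key=lambda pair: pair[1], reverse=True)
    return {word: float(weight) for word, weight in ranked[:n]}

def save_word_cloud(frequencies, path):
    """Render a word -> weight mapping as an image file."""
    from wordcloud import WordCloud  # lazy import, so top_words works without it
    cloud = WordCloud(width=800, height=400, background_color="white")
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(path)

# Invented weights for a "bad CGI"-style topic, for illustration only.
vocab = ["effects", "graphics", "cheap", "plot"]
weights = [12.0, 9.5, 7.0, 0.3]
print(top_words(weights, vocab, n=3))
```

Calling `save_word_cloud(top_words(row, vocab), "topic_0.png")` once per topic row would produce one image per topic.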
Phase 5: Trying to Make it Smarter (Failed Experiment)
- I got ambitious and tried to train a custom classifier to predict the “shark badness” score based on the review text. Used TensorFlow and Keras.
- Split the data into training and testing sets. Trained the model.
- The results were… not great. The model kept predicting everything was either “neutral” or “slightly negative.” I suspect I didn’t have enough data or the model was too simple. Maybe next time I’ll try a transformer model.
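
For a model that keeps landing on the same one or two labels, one cheap check is to compare it against always guessing the most common label. The sketch below shows a plain split, that baseline, and a deliberately small Keras model. Everything here is an assumption for illustration: the helper names, the layer sizes, and the idea that "shark badness" is stored as integer class labels over pre-vectorized text.

```python
import random
from collections import Counter

def split_data(texts, labels, test_fraction=0.2, seed=0):
    """Shuffle, then return (train_texts, train_labels, test_texts, test_labels)."""
    indices = list(range(len(texts)))
    random.Random(seed).shuffle(indices)
    n_test = int(len(indices) * test_fraction)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return ([texts[i] for i in train_idx], [labels[i] for i in train_idx],
            [texts[i] for i in test_idx], [labels[i] for i in test_idx])

def majority_baseline_accuracy(train_labels, test_labels):
    """Accuracy of always guessing the most common training label."""
    most_common = Counter(train_labels).most_common(1)[0][0]
    return sum(1 for y in test_labels if y == most_common) / len(test_labels)

def build_model(n_features, n_classes):
    """A small dense classifier over pre-vectorized text (hypothetical architecture)."""
    from tensorflow import keras  # lazy import, so the helpers above run without TensorFlow
    model = keras.Sequential([
        keras.Input(shape=(n_features,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(n_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",  # expects integer class labels
                  metrics=["accuracy"])
    return model
```

If the trained model's test accuracy is barely above `majority_baseline_accuracy`, it has mostly learned the label distribution rather than anything about the text, which would fit the "not enough data or too simple a model" suspicion.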
Conclusion
So, yeah, that’s my deep dive into why the sharks are so bad. Turns out, people hate bad CGI and dumb plots. Who knew? The AI part was mostly just a fun excuse to play with some Python libraries. The custom classifier didn’t work, but hey, you learn something new every time, right?
Next steps: Maybe try this again with a bigger dataset, or a different type of AI model. Or maybe I’ll just go watch a good shark movie. Jaws, anyone?