A project-based approach to learning Python programming for beginners. Intriguing projects teach you how to tackle challenging problems with code.
You've mastered the basics. Now you're ready to explore some of Python's more powerful tools. Real-World Python will show you how.
Through a series of hands-on projects, you'll investigate and solve real-world problems using sophisticated computer vision, machine learning, data analysis, and language processing tools. You'll be introduced to important modules like OpenCV, NumPy, Pandas, NLTK, Bokeh, Beautiful Soup, Requests, HoloViews, Tkinter, turtle, matplotlib, and more. You'll create complete, working programs and think through intriguing projects that show you how to:
"Synopsis" may belong to another edition of this book.
Lee Vaughan is a programmer, pop culture enthusiast, educator, and author of Impractical Python Projects (No Starch Press). As a former executive-level scientist at ExxonMobil, he spent decades constructing and reviewing complex computer models, developed and tested software, and trained geoscientists and engineers.
ATTRIBUTING AUTHORSHIP WITH STYLOMETRY
Stylometry is the quantitative study of literary style through computational text analysis. It’s based on the idea that we all have a unique, consistent, and recognizable style to our writing. This includes our vocabulary, our use of punctuation, the average length of our sentences and words, and so on.
A common application of stylometry is authorship attribution. Do you ever wonder if Shakespeare really wrote all his plays? Or if John Lennon or Paul McCartney wrote the song “In My Life”? Could Robert Galbraith, author of The Cuckoo’s Calling, really be J. K. Rowling in disguise? Stylometry can find the answer!
Stylometry has been used to overturn murder convictions and even helped identify and convict the Unabomber in 1996. Other uses include detecting plagiarism and determining the emotional tone behind words, such as in social media posts. Stylometry can even be used to detect signs of mental depression and suicidal tendencies.
In this chapter, you’ll use multiple stylometric techniques to determine whether Sir Arthur Conan Doyle or H. G. Wells wrote the novel The Lost World.
Project #2: The Hound, The War, and The Lost World
Sir Arthur Conan Doyle (1859–1930) is best known for the Sherlock Holmes stories, considered milestones in the field of crime fiction. H. G. Wells (1866–1946) is famous for several groundbreaking science fiction novels, including The War of the Worlds, The Time Machine, The Invisible Man, and The Island of Dr. Moreau.
In 1912, the Strand Magazine published The Lost World, a serialized version of a science fiction novel. It told the story of an Amazon basin expedition, led by zoology professor George Edward Challenger, that encountered living dinosaurs and a vicious tribe of ape-like creatures.
Although the author of the novel is known, for this project, let’s pretend it’s in dispute and it’s your job to solve the mystery. Experts have narrowed the field down to two authors, Doyle and Wells. Wells is slightly favored because The Lost World is a work of science fiction, which is his purview. It also includes brutish troglodytes redolent of the Morlocks in his 1895 work The Time Machine. Doyle, on the other hand, is known for detective stories and historical fiction.
THE OBJECTIVE
Write a Python program that uses stylometry to determine whether Sir Arthur Conan Doyle or H. G. Wells wrote the novel The Lost World.
THE STRATEGY
The science of natural language processing (NLP) deals with the interactions between the precise and structured language of computers and the nuanced, frequently ambiguous “natural” language used by humans. Example uses for NLP include machine translations, spam detection, comprehension of search engine questions, and predictive text recognition for cell phone users.
The most common NLP tests for authorship analyze the following features of a text:
• Word length: A frequency distribution plot of the length of words in a document
• Stop words: A frequency distribution plot of stop words (short, noncontextual function words like the, but, and if)
• Parts of speech: A frequency distribution plot of words based on their syntactic functions (such as nouns, pronouns, verbs, adverbs, adjectives, and so on)
• Most common words: A comparison of the most commonly used words in a text
• Jaccard similarity: A statistic used for gauging the similarity and diversity of a sample set
If Doyle and Wells have distinctive writing styles, these five tests should be enough to distinguish between them. We’ll talk about each test in more detail in the coding section.
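As a rough illustration of the first two tests, the sketch below builds word-length and stop-word frequency distributions using only the Python standard library. It is not the chapter's NLTK code: the function names, the tiny STOP_WORDS set, and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list for illustration only.
# A real analysis would use a much fuller list, such as the one NLTK provides.
STOP_WORDS = {"the", "but", "if", "and", "a", "of", "to", "in"}


def tokenize(text):
    """Lowercase the text and return its words (internal apostrophes kept)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def word_length_distribution(words):
    """Return a dict mapping each word length to its share of all words."""
    counts = Counter(len(word) for word in words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {length: count / total for length, count in sorted(counts.items())}


def stop_word_counts(words, stop_words=STOP_WORDS):
    """Count how often each stop word occurs in the word list."""
    return Counter(word for word in words if word in stop_words)


if __name__ == "__main__":
    # Made-up sentence, not a quotation from any of the novels.
    sample = "The hound ran across the moor, but the fog hid it."
    words = tokenize(sample)
    print(word_length_distribution(words))
    print(stop_word_counts(words))
```

Dividing by the total word count, rather than reporting raw counts, keeps the word-length distributions comparable when the texts differ in length.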
To capture and analyze each author’s style, you’ll need a representative corpus, or a body of text. For Doyle, use the famous Sherlock Holmes novel The Hound of the Baskervilles, published in 1902. For Wells, use The War of the Worlds, published in 1898. Both these novels contain more than 50,000 words, more than enough for a sound statistical sampling. You’ll then compare each author’s sample to The Lost World to determine how closely the writing styles match.
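The comparison step can be sketched with the Jaccard similarity, which divides the size of the intersection of two sets by the size of their union. The example below is a minimal standard-library sketch, not the chapter's implementation: the helper names, the cutoff n, and the toy strings standing in for the three novels are assumptions.

```python
import re
from collections import Counter


def most_common_words(text, n):
    """Return the set of the n most frequent words in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {word for word, _ in Counter(words).most_common(n)}


def jaccard_similarity(set_a, set_b):
    """Return |A & B| / |A | B|, or 0.0 when both sets are empty."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def closest_author(known_texts, unknown_text, n=100):
    """Score each known author against the unknown text.

    Returns the best-scoring author and the dict of all scores.
    """
    unknown_words = most_common_words(unknown_text, n)
    scores = {
        author: jaccard_similarity(most_common_words(text, n), unknown_words)
        for author, text in known_texts.items()
    }
    return max(scores, key=scores.get), scores


if __name__ == "__main__":
    # Toy placeholder strings; a real run would load the full novels.
    known = {
        "doyle": "holmes watson moor hound",
        "wells": "martian cylinder heat ray",
    }
    best, scores = closest_author(known, "the hound on the moor")
    print(best, scores)
```

A score of 1.0 means the two word sets are identical and 0.0 means they share nothing, so the author whose sample scores higher against the unknown text is the closer stylistic match by this one measure.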
To perform stylometry, you’ll use the Natural Language Toolkit (NLTK), a popular suite of programs and libraries for working with human language data in Python. It’s free and works on Windows, macOS, and Linux. Created in 2001 as part of a computational linguistics course at the University of Pennsylvania, NLTK has continued to develop and expand with the help of dozens of contributors.
"About this title" may belong to another edition of this book.
Real World Python is a collection of worked projects for readers who know some basic Python and want to do something with their knowledge. The book's short projects all teach thought processes and problem-solving as well as coding syntax. Readers learn to think their way through challenges like predicting the location of sailors lost at sea, discovering new planets, determining the author of a novel, selecting candidate landing sites for a Mars rover, programming a robot sentry gun to detect and shoot aliens (not humans), and more.