10 Essential Python Code Snippets for Text Processing
Text processing is a fundamental skill for programmers, especially those working in Natural Language Processing (NLP). Below are 10 Python code snippets for common text processing tasks, each with a brief explanation and example usage. These snippets are beginner-friendly yet powerful for real-world applications.
1. Count Words in Text
This code counts the number of words in a given text.
    def count_words(text):
        # Split on whitespace and count the resulting tokens
        words = text.split()
        return len(words)

    # Example usage
    text = "Hello, this is a sample text for processing in Python"
    word_count = count_words(text)
    print(f"Word count: {word_count}")

The function splits the text into words with split() and returns their count with len().
Output:
    Word count: 10
2. Reverse Text
This code reverses the characters in a text string.
    def reverse_text(text):
        # A slice with step -1 walks the string from end to start
        return text[::-1]

    # Example usage
    text = "Hello World"
    reversed_text = reverse_text(text)
    print(f"Reversed text: {reversed_text}")

The function uses the slice [::-1] to reverse the string efficiently.
Output:
    Reversed text: dlroW olleH
3. Search for a Word in Text
This code checks if a specific word exists in the text.
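    def search_word(text, word):
        # Compare against whole whitespace-separated tokens, not substrings
        words = text.split()
        if word in words:
            return f"The word '{word}' is found in the text"
        else:
            return f"The word '{word}' is not found in the text"

    # Example usage
    text = "This is a sample text for processing"
    word_to_find = "sample"
    result = search_word(text, word_to_find)
    print(result)

The function uses the in operator to check for the word's presence among the tokens. The match is case-sensitive, and a token with attached punctuation (such as "sample,") will not match "sample".
Output:
    The word 'sample' is found in the text
4. Count Word Frequency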
This code counts how often each word appears in the text.
    from collections import Counter

    def word_frequency(text):
        words = text.split()
        # Counter maps each distinct word to the number of times it occurs
        word_counts = Counter(words)
        return dict(word_counts)

    # Example usage
    text = "This is a text text for processing processing"
    frequencies = word_frequency(text)
    for word, count in frequencies.items():
        print(f"Word '{word}': {count} times")

The function uses Counter to create a dictionary of word frequencies.
Output:
    Word 'This': 1 times
    Word 'is': 1 times
    Word 'a': 1 times
    Word 'text': 2 times
    Word 'for': 1 times
    Word 'processing': 2 times
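Because split() leaves case and punctuation untouched, "Text" and "text," would be counted as different words. The following is a minimal sketch of one way to handle that, assuming that lowercasing and stripping ASCII punctuation is acceptable for your data. The helper name normalized_word_frequency and its top_n parameter are hypothetical and not part of the snippet above.

```python
import string
from collections import Counter


def normalized_word_frequency(text, top_n=None):
    """Count words after lowercasing and stripping ASCII punctuation.

    Hypothetical helper for illustration; top_n=None returns every word.
    """
    # Remove punctuation characters, then lowercase before splitting
    cleaned = text.translate(str.maketrans('', '', string.punctuation)).lower()
    counts = Counter(cleaned.split())
    # most_common() returns (word, count) pairs sorted by descending count
    return counts.most_common(top_n)


# Example usage
text = "Python text, python TEXT. Python!"
for word, count in normalized_word_frequency(text, top_n=2):
    print(f"Word '{word}': {count} times")
```

Returning a list of pairs rather than a dictionary keeps the ranking explicit, which is usually what a frequency report needs.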
5. Remove Stop Words
This code removes common words (stop words) from the text.
    def remove_stop_words(text, stop_words):
        words = text.split()
        # Keep only the words that are not in the stop list (case-sensitive)
        filtered_words = [word for word in words if word not in stop_words]
        return ' '.join(filtered_words)

    # Example usage
    text = "This is a sample text for processing in Python"
    stop_words = ['This', 'is', 'in', 'for']
    filtered_text = remove_stop_words(text, stop_words)
    print(f"Text after removing stop words: {filtered_text}")

Output:
    Text after removing stop words: a sample text processing Python
6. Extract Unique Words
This code extracts unique words from the text.
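    def unique_words(text):
        words = text.split()
        # Converting to a set drops duplicates; the order is not preserved
        return list(set(words))

    # Example usage
    text = "This is a text text for processing processing"
    unique = unique_words(text)
    print(f"Unique words: {unique}")

The function uses set() to remove duplicates and returns a list of unique words. Because sets are unordered, the words may come back in a different order on each run.
Output (order may vary):
    Unique words: ['This', 'is', 'a', 'text', 'for', 'processing']
7. Convert Text to Lowercase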
This code converts all characters in the text to lowercase.
    def to_lowercase(text):
        return text.lower()

    # Example usage
    text = "Hello WORLD! This is PYTHON"
    lowercase_text = to_lowercase(text)
    print(f"Lowercase text: {lowercase_text}")

The function uses lower() to standardize text case.
Output:
    Lowercase text: hello world! this is python
8. Remove Punctuation
This code removes punctuation from the text.
    import string

    def remove_punctuation(text):
        # Build a translation table that deletes every ASCII punctuation character
        return text.translate(str.maketrans('', '', string.punctuation))

    # Example usage
    text = "Hello, World! This is a sample text."
    clean_text = remove_punctuation(text)
    print(f"Text without punctuation: {clean_text}")

The function uses string.punctuation and translate() to remove all ASCII punctuation marks.
Output:
    Text without punctuation: Hello World This is a sample text
9. Count Sentences
This code counts the number of sentences in the text.
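    import re

    def count_sentences(text):
        # Split on runs of '.', '!' or '?' so all three sentence endings count
        sentences = re.split(r'[.!?]+', text)
        # Discard empty pieces, such as the one after the final punctuation mark
        sentences = [s.strip() for s in sentences if s.strip()]
        return len(sentences)

    # Example usage
    text = "This is a sample text. It has multiple sentences. Let's count them!"
    sentence_count = count_sentences(text)
    print(f"Sentence count: {sentence_count}")

The function splits the text at sentence-ending punctuation and counts the non-empty pieces. Splitting on '.' alone would miss sentences that end in '!' or '?'. Abbreviations such as "e.g." are still counted as sentence breaks.
Output:
    Sentence count: 3
10. Replace Words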
This code replaces a specific word with another in the text.
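    def replace_word(text, old_word, new_word):
        # str.replace substitutes every occurrence of old_word
        return text.replace(old_word, new_word)

    # Example usage
    text = "This is a sample text for processing"
    new_text = replace_word(text, "sample", "example")
    print(f"Text after replacement: {new_text}")

The function uses replace() to swap one word for another. Note that replace() matches substrings, so it also changes occurrences inside longer words, and it does not adjust the surrounding grammar (hence "a example" below).
Output:
    Text after replacement: This is a example text for processing
Tips for Advanced Text Processing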
- Preprocessing: Combine these functions (e.g., lowercase + remove punctuation) for robust NLP pipelines.
- Libraries: Use nltk or spacy for advanced tasks like tokenization or part-of-speech tagging.
- Scaling: For large texts, optimize with libraries or consider parallel processing.
- Customization: Adapt these snippets for specific needs, like handling different languages or file inputs.
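The preprocessing tip above can be sketched as a small pipeline that chains lowercasing, punctuation removal, and stop-word filtering. This is only a minimal sketch: the function name preprocess and the tiny STOP_WORDS set are assumptions made for illustration, not a standard stop-word list.

```python
import string

# Illustrative stop-word set (an assumption; real lists are much longer)
STOP_WORDS = {"this", "is", "a", "in", "for"}


def preprocess(text, stop_words=STOP_WORDS):
    """Lowercase, strip ASCII punctuation, and drop stop words."""
    text = text.lower()
    text = text.translate(str.maketrans('', '', string.punctuation))
    # Stop words are compared in lowercase because the text was lowercased
    return [word for word in text.split() if word not in stop_words]


# Example usage
tokens = preprocess("This is a sample text, for processing in Python!")
print(tokens)  # ['sample', 'text', 'processing', 'python']
```

Lowercasing first means the stop-word list only needs lowercase entries, which avoids the case-sensitivity issue of filtering raw text.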
These snippets provide a solid foundation for text processing and can be extended for more complex NLP tasks. Experiment with them to build powerful text analysis tools!
