Friday, June 20, 2014

Python Sets: Handy for Network Data

My Python-related posts seem to get the most reads, so here's another one!

A problem that comes up fairly often in networking is finding the unique items in a large collection of data: let's say you want to find all of the unique IP addresses that accessed a website, traversed a firewall, got denied by an ACL, or whatever. Maybe you've extracted the following list from a log file:

1.1.1.1
2.2.2.2
3.3.3.3
1.1.1.1
5.5.5.5
5.5.5.5
1.1.1.1
2.2.2.2
...

and you need to reduce this to:

1.1.1.1
2.2.2.2
3.3.3.3
5.5.5.5

In other words, we're removing the duplicates. In low-level programming languages, removing duplicates is a bit of a pain: generally you need to implement an efficient way to sort an array of items, then traverse the sorted array to check for adjacent duplicates and remove them. In a language that has dictionaries (also known as hash tables or associative arrays), you can do it by adding each item as a key in your dictionary with an empty value, then extracting the keys. In Python:

>>> items = ['1.1.1.1','2.2.2.2','3.3.3.3','1.1.1.1','5.5.5.5','5.5.5.5','1.1.1.1','2.2.2.2']
>>> d = {}
>>> for item in items:
...     d[item] = None
...
>>> d
{'5.5.5.5': None, '3.3.3.3': None, '1.1.1.1': None, '2.2.2.2': None}
>>> unique = d.keys()
>>> unique
['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']


or, more concisely using a dictionary comprehension:

>>> {item:None for item in items}.keys()
['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']
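
A note for Python 3 users (the sessions in this post are Python 2): "keys()" returns a view object rather than a list, and since Python 3.7 dictionaries remember insertion order. That means the dictionary trick also keeps items in first-seen order, which a set does not promise. A minimal sketch, assuming Python 3.7 or later; the variable name "unique_in_order" is just for illustration:

```python
# Same sample data as in the post.
items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1',
         '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']

# dict.fromkeys() builds a dictionary whose keys are the items (values are None).
# Duplicate keys collapse, and on Python 3.7+ the first-seen order is kept.
unique_in_order = list(dict.fromkeys(items))

print(unique_in_order)
# ['1.1.1.1', '2.2.2.2', '3.3.3.3', '5.5.5.5']
```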


Python has an even better way, however: the "set" type, which emulates the mathematical idea of a set as a collection of distinct items. If you create an empty set and add items to it, duplicates will automatically be thrown away:

>>> s = set()
>>> s.add('1.1.1.1')
>>> s
set(['1.1.1.1'])
>>> s.add('2.2.2.2')
>>> s.add('1.1.1.1')
>>> s
set(['1.1.1.1', '2.2.2.2'])
>>> for item in items:
...     s.add(item)
...
>>> s
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])


Predictably, you can use set comprehensions just like list comprehensions to do the same thing as a one-liner:

>>> {item for item in items}
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])


Or, if you have a list built already you can just convert it to a set:

>>> set(items)
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])
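
To tie this back to the log-file scenario, here is a rough sketch of going from raw log lines to a sorted list of unique source addresses. The log format, the "src=" field and the function name "unique_source_ips" are all made up for illustration; adjust the pattern to whatever your device actually emits. It assumes Python 3.3 or later for the "ipaddress" module.

```python
import ipaddress
import re

# Hypothetical log lines; the format is invented for this example.
log_lines = [
    "Jun 20 10:01:02 fw1 DENY src=1.1.1.1 dst=10.0.0.5",
    "Jun 20 10:01:03 fw1 DENY src=2.2.2.2 dst=10.0.0.5",
    "Jun 20 10:01:04 fw1 DENY src=3.3.3.3 dst=10.0.0.9",
    "Jun 20 10:01:07 fw1 DENY src=1.1.1.1 dst=10.0.0.5",
    "Jun 20 10:01:09 fw1 DENY src=5.5.5.5 dst=10.0.0.9",
    "Jun 20 10:01:10 fw1 link flap on ge-0/0/1",
]

# Naive pattern: captures four dot-separated groups of 1-3 digits after "src=".
# It does not check that each group is in the range 0-255.
SRC_PATTERN = re.compile(r"src=(\d{1,3}(?:\.\d{1,3}){3})")


def unique_source_ips(lines):
    """Return the set of source addresses found in the given lines."""
    found = set()
    for line in lines:
        match = SRC_PATTERN.search(line)
        if match:
            found.add(match.group(1))  # duplicates are silently discarded
    return found


unique = unique_source_ips(log_lines)

# Sets are unordered, so sort for display. Using ip_address as the key sorts
# numerically rather than as plain strings.
ordered = sorted(unique, key=ipaddress.ip_address)
print(ordered)
# ['1.1.1.1', '2.2.2.2', '3.3.3.3', '5.5.5.5']
```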


Python also provides methods for the most common types of set operations: union, intersection, difference and symmetric difference. Because these methods accept lists or other iterables, you can quickly find similarities between collections of items:

>>> items
['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1', '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']
>>> more_items = ['1.1.1.1','8.8.8.8','1.1.1.1','7.7.7.7','2.2.2.2']
>>> set(items).intersection(more_items)
set(['1.1.1.1', '2.2.2.2'])

>>> set(items).difference(more_items)
set(['5.5.5.5', '3.3.3.3'])
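
The other two operations, union and symmetric difference, work the same way. A short sketch using the same two lists (results are sorted only to make the printed output predictable, since sets are unordered):

```python
items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1',
         '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']
more_items = ['1.1.1.1', '8.8.8.8', '1.1.1.1', '7.7.7.7', '2.2.2.2']

a = set(items)

# Union: addresses that appear in either list.
either = a.union(more_items)
print(sorted(either))
# ['1.1.1.1', '2.2.2.2', '3.3.3.3', '5.5.5.5', '7.7.7.7', '8.8.8.8']

# Symmetric difference: addresses that appear in exactly one of the lists.
only_one = a.symmetric_difference(more_items)
print(sorted(only_one))
# ['3.3.3.3', '5.5.5.5', '7.7.7.7', '8.8.8.8']

# The operator forms (&, |, -, ^) do the same jobs, but both sides must be sets.
b = set(more_items)
print(sorted(a & b))
# ['1.1.1.1', '2.2.2.2']
```

The method forms are convenient when the second collection is a plain list; the operator forms raise a TypeError if you hand them a list, so convert with set() first.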


Have fun!

49 comments:

Jeremy Schulman said...

Hi Jay,

Another cool thing I tend to use for similar purposes is "collections.Counter". This will give you both a unique set of keys *and* a count of each item.

For example, I might want to gather all the items from a device inventory that has serial-numbers, and then get a listing and count.

>>> from collections import Counter

In the snippet below, the variable "sn" is a Table object (created via the Junos PyEZ library). The "sn" variable is iterable (like a dictionary), so I can use it to build a list via a comprehension and then pass that as the input to the Counter constructor:

>>> catalog = Counter([item.desc for item in sn])
>>>

You can dump the entire collection using the "items()" method:

>>> # pretty-print the catalog items
>>> from pprint import pprint
>>> pprint( catalog.items() )
[('XFP-10G-SR', 2),
('MX FPC Type 2', 1),
('MX SCB', 2),
('8x 1GE(LAN), IQ2', 1),
('RE-S-1800x4', 2),
('Front Panel Display', 1),
('SFP-T', 2),
('DPCE 40x 1GE R', 1),
('DPC PMB', 4),
('DPCE 40x 1GE R EQ', 1),
('SFP-SX', 26),
('PS 1.2-1.7kW; 100-240V AC in', 3),
('DPCE 20x 1GE + 2x 10GE R', 1),
('MX480', 1),
('MX480 Midplane', 1)]

Or access a specific item by name:

>>> catalog['SFP-SX']
26
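
For anyone without a Junos device handy, the same pattern can be tried on the IP list from the post. This is only an illustrative sketch; the variable name "hits" is arbitrary:

```python
from collections import Counter

# Sample data from the post.
items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1',
         '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']

# Counter tallies how many times each item occurs.
hits = Counter(items)

# The keys are the unique items, just as with a set.
print(set(hits) == set(items))
# True

# Look up the count for one address.
print(hits['1.1.1.1'])
# 3

# most_common(n) returns the n most frequent items with their counts.
print(hits.most_common(1))
# [('1.1.1.1', 3)]
```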

Hope this helps!

Jay Swan said...

collections.Counter is awesome. You saved me from writing a post about it!

sets, collections and itertools are probably my favorite "secrets" from the standard library.

Kirk Byers said...

Good stuff Jay. Sets are definitely underrated especially when combined with the set operations.

I used a set difference yesterday. Two email lists and I wanted to only send to the individuals that were in list A that were not in list B. Sets made this much easier.

I also liked your dictionary comprehensions and set comprehensions.

Anonymous said...

If you want to create a unique list, try:

In [1]: list(set(['123', '213', '123']))
Out[1]: ['123', '213']
