Friday, June 20, 2014

Python Sets: Handy for Network Data

My Python-related posts seem to get the most reads, so here's another one!

A problem that comes up fairly often in networking is finding the unique items in a large collection of data: let's say you want to find all of the unique IP addresses that accessed a website, traversed a firewall, got denied by an ACL, or whatever. Maybe you've extracted the following list from a log file:

1.1.1.1
2.2.2.2
3.3.3.3
1.1.1.1
5.5.5.5
5.5.5.5
1.1.1.1
2.2.2.2
...

and you need to reduce this to:

1.1.1.1
2.2.2.2
3.3.3.3
5.5.5.5

In other words, we're removing the duplicates. In low-level programming languages, removing duplicates is a bit of a pain: generally you need to implement an efficient way to sort an array of items, then traverse the sorted array to check for adjacent duplicates and remove them. In a language that has dictionaries (also known as hash tables or associative arrays), you can do it by adding each item as a key in your dictionary with an empty value, then extract the keys. In Python:

>>> items = ['1.1.1.1','2.2.2.2','3.3.3.3','1.1.1.1','5.5.5.5','5.5.5.5','1.1.1.1','2.2.2.2']
>>> d = {}
>>> for item in items:
...     d[item] = None
...
>>> d
{'5.5.5.5': None, '3.3.3.3': None, '1.1.1.1': None, '2.2.2.2': None}
>>> unique = d.keys()
>>> unique
['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']


or, more concisely using a dictionary comprehension:

>>> {item:None for item in items}.keys()
['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2']
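For comparison, here is a minimal sketch of the sort-then-scan approach described above, written in Python. The function name `unique_sorted` is made up for illustration, and the sort is a plain string sort, not a numeric IP sort:

```python
def unique_sorted(values):
    """Return the distinct values in sorted order using sort-then-scan."""
    result = []
    for value in sorted(values):
        # After sorting, duplicates sit next to each other, so it is enough
        # to compare each value with the last one we kept.
        if not result or result[-1] != value:
            result.append(value)
    return result


items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1',
         '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']
unique_items = unique_sorted(items)
print(unique_items)  # ['1.1.1.1', '2.2.2.2', '3.3.3.3', '5.5.5.5']
```

It works, but it is more code than the dictionary version for the same result.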


Python has an even better way, however: the "set" type, which emulates the mathematical idea of a set as a collection of distinct items. If you create an empty set and add items to it, duplicates will automatically be thrown away:

>>> s = set()
>>> s.add('1.1.1.1')
>>> s
set(['1.1.1.1'])
>>> s.add('2.2.2.2')
>>> s.add('1.1.1.1')
>>> s
set(['1.1.1.1', '2.2.2.2'])
>>> for item in items:
...     s.add(item)
...
>>> s
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])


Predictably, you can use set comprehensions just like list comprehensions to do the same thing as a one-liner:

>>> {item for item in items}
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])


Or, if you have a list built already you can just convert it to a set:

>>> set(items)
set(['5.5.5.5', '3.3.3.3', '1.1.1.1', '2.2.2.2'])
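As the output above suggests, a set has no particular order. If you need the unique addresses in a predictable order for a report, one option is to sort the set afterwards. The sketch below assumes plain dotted-quad IPv4 strings; the `addresses` list and the `ip_sort_key` helper are made up for illustration:

```python
def ip_sort_key(address):
    """Turn '10.0.0.2' into (10, 0, 0, 2) so addresses sort numerically."""
    return tuple(int(octet) for octet in address.split('.'))


# Hypothetical sample data; a plain string sort would put 10.0.0.2 before 9.9.9.9.
addresses = ['10.0.0.2', '9.9.9.9', '10.0.0.2', '192.168.1.1', '9.9.9.9']

# set() drops the duplicates, sorted() returns a list in numeric order.
unique_ips = sorted(set(addresses), key=ip_sort_key)
print(unique_ips)  # ['9.9.9.9', '10.0.0.2', '192.168.1.1']
```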


Python also provides methods for the most common types of set operations: union, intersection, difference and symmetric difference. Because these methods accept lists or other iterables, you can quickly find similarities between collections of items:

>>> items
['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1', '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']
>>> more_items = ['1.1.1.1','8.8.8.8','1.1.1.1','7.7.7.7','2.2.2.2']
>>> set(items).intersection(more_items)
set(['1.1.1.1', '2.2.2.2'])

>>> set(items).difference(more_items)
set(['5.5.5.5', '3.3.3.3'])
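The other two operations mentioned above, union and symmetric difference, work the same way. Here is a short sketch using the same two lists; the variable names `seen`, `in_either` and `in_exactly_one` are just for illustration, and the results are sorted only to make the printed order predictable:

```python
items = ['1.1.1.1', '2.2.2.2', '3.3.3.3', '1.1.1.1',
         '5.5.5.5', '5.5.5.5', '1.1.1.1', '2.2.2.2']
more_items = ['1.1.1.1', '8.8.8.8', '1.1.1.1', '7.7.7.7', '2.2.2.2']

seen = set(items)

# Union: addresses that appear in either list.
in_either = seen.union(more_items)
print(sorted(in_either))
# ['1.1.1.1', '2.2.2.2', '3.3.3.3', '5.5.5.5', '7.7.7.7', '8.8.8.8']

# Symmetric difference: addresses that appear in exactly one of the lists.
in_exactly_one = seen.symmetric_difference(more_items)
print(sorted(in_exactly_one))
# ['3.3.3.3', '5.5.5.5', '7.7.7.7', '8.8.8.8']
```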


Have fun!

55 comments:

Unknown said...

Hi Jay,

Another cool thing I tend to use for similar purposes is "collections.Counter". This will give you both a unique set of keys *and* a count of each item.

For example, I might want to gather all the items from a device inventory that has serial-numbers, and then get a listing and count.

>>> from collections import Counter
>>> from pprint import pprint

In the snippet below, the variable "sn" is a Table object (created via the Junos PyEZ library). The "sn" variable is iterable (like a dictionary), so I can use it to build a list via a list comprehension and then pass that as the input to the Counter constructor:

>>> catalog = Counter([item.desc for item in sn])
>>>

You can dump the entire collection using the "items()" method:

>>> # pretty-print the catalog items
>>> pprint( catalog.items() )
[('XFP-10G-SR', 2),
('MX FPC Type 2', 1),
('MX SCB', 2),
('8x 1GE(LAN), IQ2', 1),
('RE-S-1800x4', 2),
('Front Panel Display', 1),
('SFP-T', 2),
('DPCE 40x 1GE R', 1),
('DPC PMB', 4),
('DPCE 40x 1GE R EQ', 1),
('SFP-SX', 26),
('PS 1.2-1.7kW; 100-240V AC in', 3),
('DPCE 20x 1GE + 2x 10GE R', 1),
('MX480', 1),
('MX480 Midplane', 1)]

Or access a specific item by name:

>>> catalog['SFP-SX']
26

Hope this helps!

Jay Swan said...

collections.Counter is awesome. You saved me from writing a post about it!

sets, collections and itertools are probably my favorite "secrets" from the standard library.

Kirk Byers said...

Good stuff Jay. Sets are definitely underrated especially when combined with the set operations.

I used a set difference yesterday. Two email lists and I wanted to only send to the individuals that were in list A that were not in list B. Sets made this much easier.

I also liked your dictionary comprehensions and set comprehensions.

Anonymous said...

If you want to create a unique list, try:

In [1]: list(set(['123', '213', '123']))
Out[1]: ['123', '213']
