Category: programming
-
Python – Introduction to dryscrape for web scraping and taking screenshots
Installation

Install Qt4

# download http://qt.nokia.com/downloads/sdk-linux-x11-32bit-cpp-offline
wget http://www.developer.nokia.com/dp?uri=http%3A%2F%2Fsw.nokia.com%2Fid%2F8ea74da4-fec1-4277-8b26-c58cc82e204b%2FQt_SDK_Lin32_offline
chmod u+x ./QtSdk-offline-linux-x86-v1.2.1.run
sudo ./QtSdk-offline-linux-x86-v1.2.1.run

# install Qt4 Library
sudo apt-get install -y python-lxml qt4-qmake

Python Libraries

# install cssselect - fix "ImportError: No module named cssselect"
sudo pip install cssselect

# install webkit-server
git clone https://github.com/niklasb/webkit-server.git webkit-server
cd webkit-server
sudo python setup.py install

# install dryscrape…
-
Python – Image Similarity Comparison Using Several Techniques
image_similarity.py

# -*- coding: utf-8 -*-
"""
Installation of needed libraries
sudo apt-get install -y python-pip
sudo pip install PIL numpy
"""
import os, time, re, urllib
from PIL import Image
import logging
format = '%(asctime)s - %(levelname)s - %(filename)s:%(lineno)s - %(funcName)s() - %(message)s'
format = '%(asctime)s - %(filename)s:%(lineno)s - %(message)s'
logging.basicConfig(level=logging.DEBUG, format=format)
logger = logging.getLogger(__name__)

def…
-
Python – Exception and Traceback – Context and Variables
# -*- coding: utf-8 -*-
from __future__ import print_function
import time
import sys
import cPickle
from time import strftime
import inspect
import traceback

"""
Decorator Resources
http://www.ellipsix.net/blog/2010/8/more-python-voodoo-optional-argument-decorators.html
http://stackoverflow.com/questions/3931627/how-to-build-a-python-decorator-with-optional-parameters/3931903#3931903
http://typeandflow.blogspot.com/2011/06/python-decorator-with-optional-keyword.html
http://pko.ch/2008/08/22/memoization-in-python-easier-than-what-it-should-be/
http://wiki.python.org/moin/PythonDecoratorLibrary#Memoize
http://wiki.python.org/moin/PythonDecoratorLibrary#Asynchronous_Call
http://code.activestate.com/recipes/496879-memoize-decorator-function-with-cache-size-limit/
"""

def main():
    if True:
        pass
    return
    bb = "ten"
    aa = "wow"
    bigboy(bb, aa)

def bigboy(tima, boo):
    y = "gee man"
    fooo(y)
    trace(msg="great")
    import datetime…
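Since the excerpt above is cut off, here is a minimal sketch of the idea in the title: printing a traceback together with the local variables of every frame. It is not the post's own code. The names `format_exception_with_locals`, `divide`, and `demo` are made up for illustration, and the sketch assumes Python 3.

```python
import sys
import traceback


def format_exception_with_locals():
    """Return the current traceback as text, followed by each frame's locals.

    Must be called while an exception is being handled (inside `except`).
    """
    exc_type, exc_value, tb = sys.exc_info()
    lines = traceback.format_exception(exc_type, exc_value, tb)
    # Walk the traceback chain from the frame that caught the exception
    # down to the frame that raised it.
    while tb is not None:
        frame = tb.tb_frame
        lines.append("Locals in %s() at line %d:\n"
                     % (frame.f_code.co_name, tb.tb_lineno))
        for name, value in sorted(frame.f_locals.items()):
            lines.append("    %s = %r\n" % (name, value))
        tb = tb.tb_next
    return "".join(lines)


def divide(numerator, denominator):
    return numerator / denominator


def demo():
    try:
        divide(10, 0)
    except ZeroDivisionError:
        return format_exception_with_locals()


report = demo()
print(report)
```

The report ends with the arguments that `divide()` was called with, which is usually the context missing from a plain traceback.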
-
Python – Get the size of a directory tree
import os

def dirsize(start_dirpath):
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_dirpath):
        for filename in filenames:
            filepath = os.path.join(dirpath, filename)
            total_size += os.path.getsize(filepath)
    return total_size
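A small usage sketch on a throwaway directory tree. The one change from the function above is an assumption of mine, not part of the post: symlinks are skipped, because `os.path.getsize` raises `OSError` on a dangling link.

```python
import os
import shutil
import tempfile


def dirsize(start_dirpath):
    """Sum the sizes in bytes of all regular files below start_dirpath."""
    total_size = 0
    for dirpath, dirnames, filenames in os.walk(start_dirpath):
        for filename in filenames:
            filepath = os.path.join(dirpath, filename)
            # Assumption: skip symlinks so a dangling link cannot raise OSError.
            if not os.path.islink(filepath):
                total_size += os.path.getsize(filepath)
    return total_size


# Build a tiny tree: a 100-byte file plus a 50-byte file in a subdirectory.
root = tempfile.mkdtemp()
os.mkdir(os.path.join(root, "sub"))
with open(os.path.join(root, "a.bin"), "wb") as f:
    f.write(b"x" * 100)
with open(os.path.join(root, "sub", "b.bin"), "wb") as f:
    f.write(b"y" * 50)

total = dirsize(root)
print(total)  # 150

shutil.rmtree(root)  # clean up the throwaway tree
```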
-
Python – Human Readable Bytes as KB, MB, GB, TB
def hbytes(num):
    for x in ['bytes', 'KB', 'MB', 'GB']:
        if num < 1024.0:
            return "%3.1f%s" % (num, x)
        num /= 1024.0
    return "%3.1f%s" % (num, 'TB')
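A quick check of how the function behaves around the unit boundaries. The function is repeated so the snippet runs on its own, and the sample values are mine.

```python
def hbytes(num):
    # Divide by 1024 until the value fits the current unit.
    for x in ['bytes', 'KB', 'MB', 'GB']:
        if num < 1024.0:
            return "%3.1f%s" % (num, x)
        num /= 1024.0
    # Anything that is still >= 1024 GB is reported in TB.
    return "%3.1f%s" % (num, 'TB')


print(hbytes(1023))           # 1023.0bytes
print(hbytes(1536))           # 1.5KB
print(hbytes(5 * 1024 ** 3))  # 5.0GB
print(hbytes(1024 ** 4))      # 1.0TB
```

Values of 1024 TB and above are still reported in TB, since the list of units stops there.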
-
Python – yield a text file
def yield_file(filepath):
    with open(filepath, 'r') as f:
        for line in f:
            yield line

Usage:

for line in yield_file(filepath):
    do_something(line)  # whatever you want 🙂
-
Python – Pythonic Collection
Dictionaries

Unpythonic
if b != None:
    a = b
else:
    a = c

Pythonic
a = b or c  # note: also falls back to c when b is falsy (0, "", []), not only when it is None

Unpythonic
if var is None:
    print("var is not set")

Pythonic
if not var:  # note: also true for 0, "" and empty containers
    print("var is not set")

Unpythonic
print("------------------------------")

Pythonic
print("-"*30)

Keys

Unpythonic
mydict.has_key(key)

Pythonic
key in mydict…
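The short forms above are not exact replacements for the `None` checks, which the following sketch shows. The helper names `pick_with_or` and `pick_if_not_none` are made up for illustration.

```python
def pick_with_or(b, c):
    # Falls back to c whenever b is falsy: None, 0, "", [], {} ...
    return b or c


def pick_if_not_none(b, c):
    # Falls back to c only when b is None.
    return b if b is not None else c


print(pick_with_or(None, 5))      # 5
print(pick_if_not_none(None, 5))  # 5
print(pick_with_or(0, 5))         # 5  (0 is falsy, so c wins)
print(pick_if_not_none(0, 5))     # 0  (0 is a real value, so b wins)

# Key membership works the same way on Python 2 and 3.
mydict = {"key": None}
print("key" in mydict)            # True, even though the value is None
```

When 0, an empty string, or an empty container is a legitimate value, the explicit `is not None` test is the safer choice.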
-
Unix – find files and directories using the find and grep commands
Find Files containing solr

# python files, text files and java files
grep --color --include="*.py" --include="*.txt" --include="*.java" -rHnE "^import [a-z,[:space:]]+$" *

# same as above, using find with extended regular expressions
find . -regextype posix-extended -iregex ".*\.py|.*\.txt|.*\.java" -exec grep --color -rHnE "import|except" '{}' \; -print
#find . -name "*.py" -exec grep --color -rHnE "import|except" '{}' \; -print

Find Files containing solr…
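For comparison, the same kind of search can be sketched in Python with `os.walk` and `re`. This is an illustration, not part of the original post. The function name `grep_tree` and the sample files are assumptions, and the `errors=` argument to `open` requires Python 3.

```python
import os
import re
import shutil
import tempfile


def grep_tree(root, pattern, extensions=(".py", ".txt", ".java")):
    """Yield (path, line_number, line) for each matching line below root."""
    regex = re.compile(pattern)
    for dirpath, dirnames, filenames in os.walk(root):
        for filename in filenames:
            if not filename.endswith(extensions):
                continue  # only look at the wanted file types
            path = os.path.join(dirpath, filename)
            with open(path, "r", errors="replace") as handle:
                for line_number, line in enumerate(handle, start=1):
                    if regex.search(line):
                        yield path, line_number, line.rstrip("\n")


# Throwaway tree so the example runs anywhere.
root = tempfile.mkdtemp()
with open(os.path.join(root, "a.py"), "w") as handle:
    handle.write("import os\nx = 1\n")
with open(os.path.join(root, "b.log"), "w") as handle:
    handle.write("import sys\n")

matches = list(grep_tree(root, r"^import "))
for path, line_number, line in matches:
    print("%s:%d:%s" % (path, line_number, line))

shutil.rmtree(root)  # clean up the throwaway tree
```

Only `a.py` is reported, because `b.log` does not have one of the listed extensions.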
-
Python – sh – call shell commands and process results
# -*- coding: UTF-8 -*-
import os, sys

def main():
    res = sh('ls -l')
    print res

def sh(cmd_arg_str, errout=sys.stderr):
    import subprocess
    r""" Popen a shell -> line or "line1 \n line2 …", trim last \n """
    # crashes after pyqt QApplication() with mac py 2.5.1, pyqt 4.4.2
    # subprocess.py _communicate select.error: (4, 'Interrupted…
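The body of `sh` is cut off in the excerpt, so the following is only a sketch of what such a helper might look like, built on `subprocess.Popen`. It keeps the signature and the "trim last \n" behaviour described in the docstring. Everything else is an assumption, including the choice to forward the command's stderr to `errout`.

```python
import subprocess
import sys


def sh(cmd_arg_str, errout=sys.stderr):
    """Run cmd_arg_str in a shell and return stdout without its last newline."""
    proc = subprocess.Popen(
        cmd_arg_str,
        shell=True,                # the string is interpreted by the shell
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,   # return text rather than bytes
    )
    out, err = proc.communicate()
    # Assumption: pass anything the command wrote to stderr on to errout.
    if err and errout is not None:
        errout.write(err)
    # Trim only the final newline, as the original docstring describes.
    return out[:-1] if out.endswith("\n") else out


print(sh("echo hello"))  # hello
```

Because `shell=True` hands the string to the shell, this should only be used with commands you control, never with untrusted input.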
-
Postgres – Hamming distance in plpython
CREATE OR REPLACE FUNCTION util.hamming_distance (s1 text, s2 text)
RETURNS integer
/*
select * from util.hamming_distance ('hella3', 'hillo2')
*/
AS $$
return sum([ch1 != ch2 for ch1, ch2 in zip(s1, s2)])
$$ LANGUAGE plpythonu;
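The function body is plain Python, so it can be tried outside Postgres. The sketch below repeats it as a standalone function and adds a strict variant that rejects unequal lengths. The strict variant and its name `hamming_distance_strict` are my additions, not part of the post.

```python
def hamming_distance(s1, s2):
    # zip() stops at the shorter string, so extra trailing characters
    # in the longer string are silently ignored.
    return sum(ch1 != ch2 for ch1, ch2 in zip(s1, s2))


def hamming_distance_strict(s1, s2):
    # Hypothetical variant: Hamming distance is only defined for
    # strings of equal length, so reject anything else.
    if len(s1) != len(s2):
        raise ValueError("strings must have the same length")
    return hamming_distance(s1, s2)


print(hamming_distance("hella3", "hillo2"))  # 3
print(hamming_distance("abc", "abcdef"))     # 0, the extra "def" is ignored
```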