Sunday, September 16, 2012

count lines of code (cloc) in multiple PHP source folders



Over the years, I have used many tools that generate lines-of-code reports (how many lines of code you have written). The one I really like is the cloc project on SourceForge. The tool is written in Perl and is quite flexible.

Our problem is that we have a web application, and that means:

  • We need to ignore certain folders when counting lines, such as compiled templates, generated minified files and third-party libraries.
  • We need to create custom definitions for some file types (.tmpl files are our view templates and .inc files are our PHP includes). Without custom definitions, the tool would count .inc files as Pascal and ignore .tmpl files entirely.
  • We need to sum the individual reports. We have dependencies, so the project is split across different source trees and not everything is under one root.

Ignoring folders and files with cloc is dead simple: list the files and directories to skip in a file and pass that file via --exclude-list-file.

To create custom definitions (so that .tmpl files are counted as view templates and .inc files as PHP includes), first have cloc write its standard language definitions to a file using the --write-lang-def switch. We then modify this file and ask cloc to use our definitions instead of the standard ones.

Exclude list

The exclude list is a file with one entry per line.


rjha@mint13 ~/code/github/sc/deploy/apps/cloc $ cat cloc-ignore 
/home/rjha/code/github/sc/web/css/bundle-full.css
/home/rjha/code/github/sc/web/css/bundle.css
/home/rjha/code/github/sc/web/js/bundle.js
/home/rjha/code/github/sc/web/js/bundle-full.js
/home/rjha/code/github/sc/web/compiled
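
If the list of generated files changes often, it could also be built by a small script rather than by hand. The following is only a sketch: the function name build_exclude_list is made up, and it assumes that generated bundles are named bundle*.css or bundle*.js and that compiled templates live in a directory called compiled, as in the listing above.

```shell
#!/usr/bin/env bash
# Sketch only: build a cloc exclude list instead of typing paths by hand.
# Assumptions (not from cloc itself): bundles are named bundle*.css / bundle*.js
# and compiled templates sit in a directory named "compiled".

# build_exclude_list ROOT
# Prints one path per line, the format the exclude list file uses.
build_exclude_list() {
    local root="$1"
    # generated CSS and JS bundles
    find "$root" -type f \( -name 'bundle*.css' -o -name 'bundle*.js' \) | sort
    # directories holding compiled templates
    find "$root" -type d -name compiled | sort
}
```

It would be used as build_exclude_list /home/rjha/code/github/sc/web > cloc-ignore. Passing an absolute root keeps the entries absolute, like the hand-written list.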




creating a custom definitions file

First, write out the language definitions that cloc uses by default. Then add our custom definitions to this file; the simplest way is to copy the existing definition that most closely matches each custom file type. Later on, we will pass this file to cloc in place of its built-in definitions.


 ./cloc-1.56.pl --write-lang-def=cloc.def

Add the following definitions to this file:

View Template
    filter remove_html_comments
    filter call_regexp_common HTML
    extension tmpl
    3rd_gen_scale 1.0


PHP Include
    filter remove_matches ^\s*#
    filter remove_matches ^\s*//
    filter call_regexp_common C
    filter remove_inline #.*$
    filter remove_inline //.*$
    extension inc
    3rd_gen_scale 1.0
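
The same edit can be scripted so that it is repeatable after regenerating cloc.def. This is a minimal sketch, not part of cloc: the function name append_lang_defs is hypothetical, and the definitions are the two shown above, copied verbatim. The quoted heredoc delimiter keeps the shell from touching the backslashes and dollar signs in the filters.

```shell
#!/usr/bin/env bash
# Sketch only: append the two custom definitions to a definitions file
# previously written with --write-lang-def.

# append_lang_defs FILE
append_lang_defs() {
    local def_file="$1"
    # 'EOF' is quoted, so \s and $ reach the file unchanged
    cat >> "$def_file" <<'EOF'
View Template
    filter remove_html_comments
    filter call_regexp_common HTML
    extension tmpl
    3rd_gen_scale 1.0
PHP Include
    filter remove_matches ^\s*#
    filter remove_matches ^\s*//
    filter call_regexp_common C
    filter remove_inline #.*$
    filter remove_inline //.*$
    extension inc
    3rd_gen_scale 1.0
EOF
}
```

It would be called as append_lang_defs cloc.def. If the generated file already assigns the inc extension to another language, that entry may need editing as well; this is an assumption worth checking against your own cloc.def.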




script to sum reports


rjha@mint13 ~/code/github/sc/deploy/apps/cloc $ cat cloc.sh 
# web folder - use an ignore list
# web folder - needs custom definitions

./cloc-1.56.pl  --exclude-list-file=./cloc-ignore --read-lang-def=./cloc.def   /home/rjha/code/github/sc/web --report-file=sc.web.report
./cloc-1.56.pl  /home/rjha/code/github/sc/lib --report-file=sc.lib.report
./cloc-1.56.pl  /home/rjha/code/github/webgloo/lib/com/indigloo --report-file=webgloo.report
# sum the reports 
./cloc-1.56.pl  --read-lang-def=./cloc.def  --sum-reports *.report
#remove tmp
rm *.report




Here we run cloc on three separate folders. The first run uses an ignore list and the custom definitions file created earlier; the second and third are standard cloc reports.
Finally, we use the --sum-reports option to produce the final report across the three source trees.



rjha@mint13 ~/code/github/sc/deploy/apps/cloc $ ./cloc.sh 
     239 text files.
     239 unique files.                                          
      54 files ignored.
Wrote sc.web.report
      94 text files.
      94 unique files.                              
       0 files ignored.
Wrote sc.lib.report
      41 text files.
      41 unique files.                              
       0 files ignored.
Wrote webgloo.report

http://cloc.sourceforge.net v 1.56
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
PHP                            226           4795           1340          13256
PHP Include                     32            243             31            902
CSS                              1            219             33            850
Javascript                       1            249             99            823
View Template                   54            185              5            681
HTML                             4             19              0             46
XML                              2              0              0             44
-------------------------------------------------------------------------------
SUM:                           320           5710           1508          16602
-------------------------------------------------------------------------------





Friday, September 14, 2012

infinite scrolling but on steroids



We now use the infinite scrolling plugin by Paul Irish (https://github.com/paulirish/infinite-scroll) on 3mik. However, the problem with the vanilla plugin is that it assumes the next URL depends only on the current page number. Only a naive pagination scheme would depend on a single variable.

As you go deeper into the pages, you have to scan more and more records before you arrive at the records you are interested in. For this reason, real-world pagination schemes rely on two variables (see these slides from the Percona conference and this blog post):


  • current page number
  • last record id (of previous page ) or first record id (of next page)

An example would be the Instagram API.

To apply the infinite scrolling pattern to pages with such a two-variable pagination scheme, you need to do some additional work. So here is our fork of the infinite scroll plugin, which documents this use case and provides sample server-side PHP scripts.
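
To make the two-variable idea concrete, here is a tiny sketch of how a next-page URL could be built from both values. It is not taken from the fork: the function name next_page_url and the query parameter names page and after are made up for illustration. The point is that the server can use the record id to seek directly to the right rows, while the page number is still available for display.

```shell
#!/usr/bin/env bash
# Sketch only: build the URL of the next page from two variables.
# Assumption: "page" and "after" are hypothetical parameter names.

# next_page_url BASE CURRENT_PAGE LAST_ID
#   BASE          base URL of the listing
#   CURRENT_PAGE  number of the page just shown
#   LAST_ID       id of the last record on that page
next_page_url() {
    local base="$1" page="$2" last_id="$3"
    printf '%s?page=%d&after=%s\n' "$base" "$((page + 1))" "$last_id"
}
```

For example, next_page_url /items 2 9871 prints /items?page=3&after=9871. A plugin that only increments a page counter has no way to supply the second value, which is why the vanilla behaviour falls short here.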




© Life of a third world developer