Friday, July 11, 2014

A style rule for CamelCase

for JavaScript, in particular.

The context

"bactrian camels have two humps" is an example of a string.

In most programming languages, identifiers (labels or names for variables and values) are not arbitrary strings, but strings of a very restricted form. They are also syntactic entities of their own, since they are written without quotation marks.

camel27 is a typical identifier.

An immediate consequence of dropping the quotation marks as delimiters is that white space must be excluded from identifiers.

The phrase bactrian camels have two humps is not one, but (a sequence of) five identifiers.

One strategy to overcome this limitation is the insertion of strokes: bactrian-camels-have-two-humps is an identifier in, say, Scheme, and bactrian_camels_have_two_humps is a legal identifier in JavaScript.

CamelCase is another strategy for converting a multi-word phrase into a single word: bactrianCamelsHaveTwoHumps is now an identifier for JavaScript. And of course, JavaScript itself is another example.

The rule

I suggest the following rule for the generation of CamelCase identifiers:

Suppose the description of the identifier in natural language has the form

w_1 w_2 w_3 .... w_n

i.e. n words, separated by white space. For example,

the president of USA

Try to avoid one-letter words (such as I or a) in the given phrase.

Now convert the phrase into a CamelCase identifier by doing:

  1. For each of the n words, turn all but the first letter into lower-case letters. So the example phrase now is

    the president of Usa
  2. Capitalize the first letter of the words w_2,..., w_n. So

    the President Of Usa
  3. Remove the white space between the words, so

    thePresidentOfUsa

The result is your identifier.

Some of the words (but not w_1 or two consecutive words) in the initial phrase may be decimals (e.g. 4321), and they remain unchanged. For example, the phrase Henry 5 of England turns into the identifier Henry5OfEngland. In that case, the first letter of the word following the decimal may remain unchanged, i.e. the of does not have to be changed to Of, and the resulting identifier is Henry5ofEngland.
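A minimal sketch of the rule in JavaScript (the function name toCamelCase is my own choice for illustration; the optional exception after a decimal is not implemented, so Henry 5 of England becomes Henry5OfEngland here):

    // Convert a white-space separated phrase into a CamelCase identifier,
    // following the three steps above. Decimal words remain unchanged.
    function toCamelCase(phrase) {
      const words = phrase.trim().split(/\s+/);         // the words w_1 ... w_n
      return words
        .map((word, i) => {
          if (/^\d+$/.test(word)) return word;          // decimals remain unchanged
          const rest = word.slice(1).toLowerCase();     // step 1: all but the first letter in lower case
          const first = i === 0
            ? word.charAt(0)                            // w_1 keeps its first letter as it is
            : word.charAt(0).toUpperCase();             // step 2: capitalize the first letter of w_2, ..., w_n
          return first + rest;
        })
        .join('');                                      // step 3: remove the white space
    }

    toCamelCase('the president of USA');             // 'thePresidentOfUsa'
    toCamelCase('bactrian camels have two humps');   // 'bactrianCamelsHaveTwoHumps'
    toCamelCase('Henry 5 of England');               // 'Henry5OfEngland'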

Remarks and examples

Many identifiers in JavaScript are ugly in the sense that they contain too many consecutive capital letters, which makes them hard to read and memorize. Had the rule been applied, things would look a little nicer. Some examples:

  • JavaScript is all right, at least as a CamelCase identifier, but ECMAScript is bad. The phrase ECMA script converts to the identifier EcmaScript, and that would be a much better spelling.

  • JavaScriptObjectNotation would be all right, but JSON is a bad identifier and should have been Json instead. JSON is a standard object in ECMAScript 5, and it is not only a badly, but also a wrongly chosen identifier: usually, only constants are written exclusively in capital letters.

  • The whole DOM is full of bad-style identifiers; the most striking example is XMLHttpRequest. Two acronyms in two different spellings (XML and Http), what a mess. It should have been XmlHttpRequest.

Another particular challenge is the naming of Node.js-related code, especially since there already is a different Node object in the standard. If we treat the dot as a space between words, Node.js is the phrase Node js, and that converts into NodeJs, according to our rule.

Wednesday, September 4, 2013

The HTML5 Canvas Handbook

This HTML file, CanvasHandbook.html, comprises The HTML Canvas Handbook, an introduction to the <canvas> tag and a comprehensive reference to the corresponding JavaScript objects.

The whole HTML document is presented in a single file. A print version takes about 80-110 pages.

This blog post is meant to be the forum for this text. Comments and remarks are very welcome!

Friday, January 21, 2011

ElephantMark

a simple PHP documentation tool with Markdown markup

Sure, (X)HTML is the standard format for documentation, and that is great. This holds even more for people writing programs. But then, writing or reading HTML code directly is a bit awkward, and a couple of lightweight markup languages have emerged to make that easier. In the past, I often used Perl's POD. But recently, I discovered Markdown, which is even more convenient and versatile as a general tool.


I think Markdown is also very suitable as a format for program comments. For example, wrapping a piece of code ... into code tags <code>....</code> is achieved by simply placing backticks around it. Putting a block of code inside <pre><code>....</code></pre>, with special characters (&, <, > etc.) converted into HTML entities, is simply done by indenting the lines with 4 spaces or 1 tab. All this is very intuitive and effortless.


With ElephantMark, I now have a tool that puts this idea into practice for PHP. ElephantMark is two things: first, it states three short and simple rules that turn ordinary PHP comments into Markdown text pieces; secondly, it is a script, elephantmark.php, that actually performs this conversion and can be used like phpdoc.


ElephantMark is no competition for standard documentation tools. Professional programmers building big libraries will need smarter tools with features like automatic cross references etc. The idea of using Markdown in source code comments is probably more interesting for people who need to write in many different programming languages and use PHP only occasionally. ElephantMark is understood in five minutes; there is no entirely new documentation language to learn.


In its current version 1, elephantmark.php makes use of the Markdown-to-HTML converter markdown.php, written by Michel Fortin. This is a great tool itself, thank you very much! Currently, one needs both these scripts for ElephantMark conversions. But maybe there is a way to merge them into one file in a future version.


So, here is the script, which is its own manual:


elephantmark.php



Thursday, March 25, 2010

Change Logic

As part of the reconstruction of the computability and intelligence concept in terms of propositional logic, I am currently working on an effective formalization of (causal) processes. I realized that this goal is very similar to that of certain branches of modal logic, but that the approach is very different. Although the work itself is still in a premature state, I thought it would be interesting to work out these differences and explain some core ideas. This resulted in a five-page paper that starts with the following preface:


Suppose temporal logic is the subject that looks for a language and logic to reason about processes and things that change in time. Then it seems that this implies a thorough study of time itself. But this is wrong. Time is a philosophical burden and dead weight in temporal logic. We shouldn't try to associate events with a time structure; we only need to register change during the process. This paradigm shift is one starting point of a research project called change logic.


However, the elimination of time from temporal logic may not be as surprising as it tries to sound here. Actually, in the standard modal logical reconstruction, the relation to a time structure via a Kripke model is also only temporary. Once the formal system is motivated and its soundness and completeness are shown, time becomes superfluous here as well and disappears. In fact and in return, it is also possible to attach a linear time structure to what will be introduced as change logic.


So in the end, the real change with change logic comes not so much from a new philosophical semantics as from the fact that the whole thing is pulled off without adding new constructs to the syntax. In other words, change logic is temporal logic without modal operators.


The whole text is located here.

Thursday, February 4, 2010

"Literal Mathematics"

There seems to be a real revolution in mathematics on its way, namely the long-due formal standardization.

This process is maybe best elaborated in this hierarchy of three designs:
  • MathML, now in version 3.0 (December 2009), an XML application with the basic syntax in two modes: Presentation Markup and Content Markup. [http://www.w3.org/TR/MathML]
  • OpenMath, a standard for representing the semantics of mathematical objects by means of content dictionaries, complementing MathML's Content Markup. [http://www.openmath.org]
  • OMDoc, which extends OpenMath and MathML to the level of whole documents, i.e. statements, theories and proofs. [http://omdoc.org]
Sure, in our times there is an inflation of revolutions. But I think this one is a really big one, despite the unexciting appearance of any norm.

  • A standard will provide us with a common language. A common language for mathematicians is more than a lingua franca or English as the world language. Formal scientists always need to create their world first, before they can talk about it. But at present, there is not even an agreement on whether the "natural numbers" start from 0 or 1. This is not freedom, but pure inefficiency.
  • Currently, each constructive idea requires the choice of a concrete programming language when it is implemented. "Higher" languages have an interface concept that abstracts the signature from the details of the implementation. But there are no real standards for the translation between different languages, and each language needs to re-implement all the libraries to be a useful one. Of course, there is XML now, and that is a big step (MathML, OpenMath, and OMDoc are XML, too). But for mathematical structures and theories, XML itself is all too general.
  • This standardization of mathematics is not a new foundation in the meta-mathematical sense. It is a syntactic agreement, not a semantical super-theory. It doesn't explain what a mathematical object really is; it only defines how we define one. I suppose this whole movement comes so late because it was anything but obvious from the tradition that a global standard does not need a global ontology. (And maybe in the end, that is a new foundation, after all.)
  • Donald Knuth introduced literate programming as an emphasis on the idea that documents should be comprehensible for humans and computers alike. Many programming languages offer "literate programming" tools, but none of them lives up to the promise.
Unfortunately, at present there is still a shortage of tools to efficiently and comfortably work with these new formats.

Tuesday, February 2, 2010

PropLogic

In The propositional logic project, I describe my objections to the common state of this subject as follows:

Propositional logic sure has become a standard notion of our scientifically educated society, maybe as much as the arithmetic systems of integers, rational and real numbers. Boolean operations are standard in many areas of our computerized daily life, and digital logic is the mathematical structure behind computer hardware and software.


But unlike arithmetic systems and other basic data structures like lists, matrices or regular expressions, propositional algebras are not part of the standard tool repertoire of programming languages. This is certainly due to the cost explosion (see e.g. the boolean satisfiability problem) of its default implementation. What we need is a fast implementation, one that allows these structures to be used as basic tools in other programs.


The other problem with propositional logic is its classic algebraization as a (free) boolean algebra, which is only an abstraction of the semantic structure of propositional logic. That way, we lose some of the information. In other words, we need an algebraization that also preserves the syntactic structure of propositional worlds.

I am happy to announce the release of PropLogic, a Haskell package that intends to fix these problems and that might serve as a general and useful tool.


Despite my original intent to write a compact implementation of a pretty compact theory, this distribution is overloaded in an attempt to explain all its aspects. I suppose the best place to start is A little program for propositional logic and a Brief introduction to PropLogic.


The first of these two tutorials doesn't require prior Haskell knowledge. Any fast implementation of a propositional algebra also provides a fast SAT solver, and there is an interest in and competition for the quickest solution. I have no idea how my program performs compared to other existing algorithms out there, but I tried to illustrate with some data how well it does the job. (I must admit, however, that my "fast" program has its limits, too.)


The thing seems to work properly as it is, but I would still like to do some polishing and upload it to Hackage soon. It would be very nice to get a boost from the comments and reactions of the Haskell community.

Friday, November 6, 2009

A fast SAT solver

A decade ago, I developed a system for propositional logic based on Prime Normal Forms. The main function takes an arbitrary propositional formula φ and returns its prime conjunctive normal form pcnf(φ). Implicitly, this algorithm solves the SAT problem, i.e. it provides a general decision method for the question whether a given formula φ is satisfiable or not, namely:
φ is satisfiable iff pcnf(φ) is not (the normal form of) 0, i.e. "false".
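For a trivial illustration of this criterion: pcnf(α ∧ ¬α) reduces to 0, so the contradiction α ∧ ¬α is recognized as unsatisfiable, whereas the prime conjunctive normal form of a satisfiable formula such as α ∨ β is just α ∨ β itself, which is not 0.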


Obviously, the SAT problem is one of the hot issues in computer science, and there is a demand for a fast algorithm. However, it has never been the focus of my own research project, which rather deals with a re-interpretation of modern logic. I just needed a system that provided me with the functionality of propositional logic, and at that time, I didn't know of any available one. I did suggest, however, that the solution I found would satisfy a general demand and that my approach touched on some very deep insights into the matter. I published the mathematical theory in a paper. I also wrote a Java applet that works like an online pocket calculator for propositional logic and accompanied it with a couple of tutorials and introductions for all kinds of users.


Sketch of the method


In my publications I rather use the dual as the default, i.e. I consider Prime Disjunctive Normal Forms, the function is pdnf instead of pcnf, and the satisfiability problem becomes the validity problem. The algorithm for the pdnf function is not stochastic or heuristic in nature; it is a strictly deterministic and algebraic procedure. I'll try to sketch its basic features, but let me first recall some (more or less) standard terminology and well-known facts:


  • A literal λ is either an atomic or a negated atomic formula, i.e. α or ¬α.

  • A normal literal conjunction or NLC γ is a conjunction of literals [λ1 ∧ ... ∧ λk] such that the atoms α1, ..., αk occurring in these literals are strictly linearly ordered, according to some given linear order relation < on the chosen set of atoms. Each λi is a component of γ.

  • A disjunctive normal form or DNF Δ is a disjunction of NLC's [γ1∨...∨γn]. Each γi is a component of Δ. It is well known that each formula φ has an equivalent DNF Δ, written φ⇔Δ.

  • Given an NLC γ=[λ1 ∧ ... ∧ λk] and a DNF Δ=[γ1∨...∨γn], we say that

    • γ is a factor of Δ, if γ implies (or is subvalent to) Δ, written γ⇒Δ.

    • γ is a prime factor of Δ, if it is a factor and none of its components λ1, ..., λk can be deleted without violating the subvalence γ⇒Δ.



  • A DNF Δ=[γ1∨...∨γn] is called a

    • prime DNF or PDNF, if the set of its components {γ1, ..., γn} is exactly the set of all its prime factors.

    • minimal DNF or MDNF, if there is no other equivalent DNF which is smaller in size. (The size of a DNF is the number of components and atom occurrences.)



  • Every propositional formula φ has an equivalent PDNF. This PDNF is unique (up to the order of its components). So the function pdnf that returns the equivalent PDNF pdnf(φ) for every given φ is a well-defined canonization of propositional logic.

  • Every φ also has an equivalent MDNF. But this MDNF is not unique in general. It is, however, always a subset of the PDNF in the sense that each component of the MDNF must be a component of the PDNF (see the example below).
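For example, consider Δ = [[A∧B] ∨ [¬B∧C]]. Its prime factors are [A∧B], [¬B∧C] and [A∧C], so its PDNF is [[A∧B] ∨ [¬B∧C] ∨ [A∧C]]. Δ itself, on the other hand, is already an MDNF, and its two components form a proper subset of the three components of the PDNF. We will meet this example again below.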


Our goal is an implementation of the pdnf function, i.e. the construction of an equivalent PDNF Δ for each given formula φ. The real core of this function is the P-Procedure, which takes an arbitrary DNF Δ and returns the equivalent PDNF P-Procedure(Δ). A classical method to implement the P-Procedure is the Quine-McCluskey method. But that algorithm grows exponentially and is not feasible for anything but small input DNF's. We need something else, and we start with the idea of pairwise component minimization, which we call the M-Procedure:


  1. We take two components γL and γR of the given Δ and replace them by the components of the MDNF [μ1∨...∨μm] of [γL∨γR]. Obviously, m is either 1 or 2, so this step can only decrease the size of Δ.

  2. We repeat the first step until no more changes can be applied.


The resulting DNF, denoted by M-Procedure(Δ), is what we call a pairwise minimal DNF or M2DNF, i.e. a DNF where each pair of components makes a minimal DNF. It is easy to prove that

  • each PDNF is an M2DNF, and

  • each MDNF is an M2DNF.


But neither of these two facts holds the other way round. M-Procedure(Δ) is neither the prime nor a minimal form of Δ, at least not in general. The M-Procedure is not a realization of the P-Procedure (hence the two different names). But it will serve us well in a proper implementation of the P-Procedure...


I suppose that most people who have spent some time and concentration on the SAT problem have tried this approach of an M-Procedure. It is not a trivial matter to understand why it has to fail. The notion of prime in propositional logic is probably motivated by the corresponding concept in number theory. But a closer investigation reveals a surprising and fundamental difference between prime factors of propositional formulas and those of integers. This problem, but also its solution, stems from the analysis of binary DNF's [γL∨γR].


For every two NLC's γL and γR we write


  • min(γL,γR) for the MDNF of [γL∨γR], and

  • prim(γL,γR) for the PDNF of [γL∨γR].


These functions min and prim have straightforward implementations (of linear complexity), and they are not hard to explain. The interesting and crucial point here is the fact that

  • min(γL,γR) is made of either one or two components, as mentioned earlier,

  • prim(γL,γR) is often the same as min(γL,γR), but there is also a situation where min(γL,γR)=[γL∨γR] and prim(γL,γR)=[γL∨γR∨γc] is a 3-component DNF. For example, consider

    prim([A∧B], [¬B∧C]) = [[A∧B] ∨ [¬B∧C] ∨ [A∧C]]

    This third and new component γc is what we call the c-prime (a small code sketch of its construction follows below).
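To make the c-prime concrete, here is a small JavaScript sketch of the consensus construction behind it. The representation of literals and NLC's is my own choice for illustration (it is not the data structure of the actual implementation), and the resulting term still has to be checked against γL and γR by the min and prim functions before it counts as a third prime factor:

    // A literal is { atom: 'A', neg: false } for A, or { atom: 'B', neg: true } for ¬B.
    // An NLC is an array of literals, sorted by atom, each atom occurring at most once.

    // Return the consensus NLC of two NLCs, or null if it does not exist.
    // It exists iff exactly one atom occurs with opposite signs in the two NLCs;
    // it is then the conjunction of all remaining literals of both NLCs.
    function consensus(left, right) {
      const signs = new Map();                     // atom -> sign as seen in `left`
      for (const lit of left) signs.set(lit.atom, lit.neg);

      const clashes = right.filter(
        lit => signs.has(lit.atom) && signs.get(lit.atom) !== lit.neg
      );
      if (clashes.length !== 1) return null;       // no candidate for a c-prime

      const clashAtom = clashes[0].atom;
      const merged = new Map();                    // remaining literals, without the clashing atom
      for (const lit of [...left, ...right]) {
        if (lit.atom !== clashAtom) merged.set(lit.atom, lit);
      }
      return [...merged.values()].sort((a, b) => a.atom.localeCompare(b.atom));
    }

    // The example from the text: the consensus of [A∧B] and [¬B∧C] is [A∧C].
    const gammaL = [{ atom: 'A', neg: false }, { atom: 'B', neg: false }];
    const gammaR = [{ atom: 'B', neg: true }, { atom: 'C', neg: false }];
    console.log(consensus(gammaL, gammaR));
    // -> [ { atom: 'A', neg: false }, { atom: 'C', neg: false } ]   i.e. [A∧C]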


Now we are able to implement the P-Procedure:

Algorithm P-Procedure(Δ)
begin
  Δ' := M-Procedure(Δ) ;
  repeat
    (1.) Δ'' := Δ' ;
    (2.) let Π be the set of all c-primes of component pairs in Δ' ;
    (3.) attach all the components of Π to Δ' ;
    (4.) Δ' := M-Procedure(Δ') ;
  until Δ' and Δ'' contain the same set of components ;
  return Δ' ;
end.

The proof of the correctness of this P-Procedure is based on a deep result which I called the Completeness Theorem, saying that a DNF is a PDNF iff it is a c-complete M2DNF.


Concerning its computational complexity: if n is the number of different atoms in Δ, then the P-Procedure needs no more than n iterations of the repeat loop. This, together with the fact that the M-Procedure is of polynomial complexity, led me to suggest that the P-Procedure is of polynomial complexity as well. And that, of course, would have been a surprising answer to the open P=NP problem. When I realized this, I spent some time looking for evidence for or against my conjecture, but I was only able to deliver some lemmata and partial proofs, no definite decision.


Links


All mentioned material is available on www.bucephalus.org. In particular: