A little program for propositional logic
January 2010
Bucephalus.org

# Introduction

`PropLogic` is a Haskell package for propositional logic. It also contains a standalone executable of the same name, which is able to run a small collection of the functions defined in the package. Most of them are Prime Normal Form converters of some sort.

For example, consider the `spdnf` function that returns the Simplified Prime Disjunctive Normal Form of a given propositional formula

```  \$ ./PropLogic spdnf "[-[-x * y] * [-y + z] * --[-x + z] * [z + y + z]* true]"
[[x * z] + [-y * z]]```

i.e. `PropLogic` applies `spdnf` on `[-[-x * y] * [-y + z] * --[-x + z] * [z + y + z] * true]` and returns its unique equivalent Prime Normal Form, namely `[[x * z] + [-y * z]]`.

The power of a normalizer like `spdnf` and its relatives first of all lies in the fact that it solves all traditional problems of propositional logic: Need to know if a formula is satisfiable? Need to know if two formulas are equivalent? All this is immediately answered with `spdnf`: the example input is satisfiable, because the result is not `false`, and it is not valid, because the result is not `true`.

Another feature of our prime form normalizers lies hidden in the underlying algorithms: they are fast. In a straight-forward, traditional, or default fashion, the mentioned problems are of exponential complexity and the computational costs to solve them explode. The `PropLogic` implementations however should be feasible for input of any size. The second part of this introduction contains some data that translate "fast" into real numbers of seconds on a current standard computer.

The `PropLogic` package is written in the Haskell programming language. But no prior understanding of Haskell is required to handle the command or this document. It is written also for users that are just looking for a tool for solving propositional logical problems. In particular, it can be used as a fast SAT-solver.

# Preliminaries

## Installation

For a detailed description of the installation see [Inst]. We give a quick and dirty version of the installation process here, which enables us to run the program.

Shortcut for Windows users
PropLogic.exe
into a working directory. Start the terminal window (DOS command) and navigate (with `cd`) to that place. If you then run, say
`PropLogic help`
you should find out that it works.
Compilation under Unix-like systems
PropLogic-0.9.tar.gz
into an appropriate working directory. Extract the archive file with
`\$ tar -xvf PropLogic-0.9.tar.gz`
change to the newly created subdirectory with
`\$ cd PropLogic-0.9`
and compile with
`\$ ghc --make -o PropLogic Main.hs`
That should do it. Now check
`\$ ./PropLogic help`
to confirm that everything works fine.

## Conventions

Note, that we follow a common convention to use `\$` as the standard shell (or DOS) prompt to indicate the user input. When we write

```  \$ PropLogic pdnf "[p <-> [-q + r]]"
[[-p * q * -r] + [p * -q] + [p * r]]```

then this means that we input `PropLogic pdnf "[p <-> [-q + r]]"` and we obtain the result `[[-p * q * -r] + [p * -q] + [p * r]]`.

Unix-like systems however will not look in the working directory for the `PropLogic` command by default and it needs to be enforced to do so with the prefix `./` for the working directory. In other words, we have to modify the previous input line to

`  \$ ./PropLogic pdnf "[p <-> [-q + r]]"`

We continue this habit in the examples below.

Windows user should remember however to omit the prefix and replace the previous line by something like

`  C:\working\directory> PropLogic pdnf "[p <-> [-q + r]]"`

where `PropLogic` may also be written `proplogic`, as Windows is case-insensitive in this respect.

# Doing Propositional Logic with Prime Normal Forms

## What is a Propositional Formula?

The rules for the construction of formulas are represented by the following table.

Syntax of propositional formulas (in fancy notation)
`α` for every atom `α` atomic formula
`false` zero or false value
`true` unit or true value
`-φ` for every formula φ negation
`[φ1 * ... * φn]` for all formulas `φ1, ..., φn` with `n≥0` conjunction
`[φ1 + ... + φn]` for all formulas `φ1, ..., φn` with `n≥0` disjunction
`[φ1 -> ... -> φn]` for all formulas `φ1, ..., φn` with `n≥0` subjunction
`[φ1 <-> ... <-> φn]` for all formulas `φ1, ..., φn` with `n≥0` equijunction
where an atom `&alpha` is any nonempty sequence of letters and digits, except for the keywords `true` and `false`.

Note, that

• five example atoms (or atomic formulas) are given by

`  halloWorld   123   x   tea4two   1and2and3`

and two example formulas are given by

```  -[x + y + -[x * z] + [x -> y] + false + [-x + y]]
[x <-> --x <-> [y * z] <-> [y -> true -> z]]```
• Con-, dis-, sub- and equijunctions are multiary or n-ary, i.e. they may have any number of arguments. A unary conjunction, i.e. a conjunction of only one formula `phi` is written `[* phi]`. The nullary conjunction is `[*]`. The same holds of the other junctors, although one usually doesn't need these cases.

• The outer brackets `[...]` around the n-ary junctions are mandatory. That disambiguates the parsing of formulas and makes any preference rules obsolete.

• The notation here is called "fancy" because it is different from the "pure" notation, which is the real Haskell data type in the package implementation. In our context here however, we will only be dealing with the fancy version.

## What are DNFs and CNFs?

You will probably be familiar with the notion of a Disjunctive Normal Form (or DNF) and the Conjunctive Normal Form (or CNF).

A DNF is a disjunction of literal conjunctions, where a literal is either an atomic formula, or a negated atomic formula. For example,

`  [[-v * -w * x * -y * -z] + [v * w * -x] + [v * -x * z] + [-x * y]]`

Our definition of a DNF also requires each literal conjunction to be a Normal Literal Conjunction (or NLC) in the sense that the atoms its literals are in a strict ascending order. For example, the second literal conjunction `[v * w * -x]` is a NLC, because `v < w < x`.

Similarly (or dually, as it is called in the context of boolean algebras), a CNF is a conjunction of Normal Literal Disjunctions (or NLDs), i.e. disjunctions of literals with atoms in a strict ascending order. An example CNF is say

`  [[-1 + -2 + 3 + -4 + -5] * [1 + 2 + -3] * [1 + -3 + 5] * [-3 + 4]]`

In the sequel, we write `<=>` for the usual equivalence relation and `=>` for the subvalence (or consequence or entailment relation) on formulas. For example, `--x <=> [x * true]` and `[x * y] => x` are two true statements.

It is a commonly known fact that each formula `φ` has an equivalent DNF `Δ` and an equivalent CNF `Γ`.

## What is a Prime Normal Form?

Definition (PDNF and PCNF)

Let `γ = [λ1 * ... * λk]` be a NLC and `Δ = [γ1 + ... + γn]` a DNF. We say that
• `γ` is a factor of `Δ`, if `γ => Δ`.
• `γ` is a prime factor of `Δ`, if `γ` is a factor of `Δ` and there is no other different factor `γ` of `Δ` with `γ => γ' => Δ`. In other words, if we delete any of the `λi` in `γ`, then the resulting NLC `γ' = [λ1 * ... * λi-1 * λi+1 * ... * λk]` is no longer a factor of `Δ`.
• `Δ` is a Prime DNF or PDNF, if the components `γ1`, ..., `γn` are exactly the prime factors of `Δ` and if `γ1 < ... < γn` (where `<` is a standard linear order on NLCs).
Let `δ = [λ1 + ... + λk]` be a NLD and `Γ = [δ1 * ... * δn]` a CNF. We say that
• `δ` is a cofactor of `Γ`, if `Γ => δ`.
• `δ` is a prime cofactor or coprime of `Γ`, if `δ` is a cofactor of `Γ` and there is no other different cofactor `δ'` of `Γ` with `Γ => δ' => δ`.
• `Γ` is a Prime CNF or PCNF, if the components `δ1`,...,`δn` are exactly the coprimes of `Γ` and if `δ1 < ... < δn` (where `<` is a standard linear order on NLDs).

For example, `[x * -y * -z]` is a factor of the DNF `[[x * -z] + [y * z]]`, because

`  [x * -y * -z]  =>  [[x * -z] + [y * z]]`

But it is not a prime factor, because `[x * -z]` is also a factor and

`  [x * -y * -z]  =>  [x * -z]  =>  [[x * -z] + [y * z]]`

It is not possible to delete any more literals from `[x * -z]`, the NLCs `[* x]` and `[* -z]` are no factors of the given DNF anymore, and that makes `[x * -z]` a prime factor.

Alltogether, there are three prime factors of `[[x * -z] + [y * z]]`, namely `[x * y]`, `[x * -z]` and `[y * z]`. Their disjunction

`  [[x * y] + [x * -z] + [y * z]]`

is a PDNF.

(The three prime factors in this PDNF are properly ordered, i.e. `[x * y] < [x + -z] < [y + z]`. But we don't bother with the details of a proper definition for this linear order `<` on the NLCs.)

Important is however, that this PDNF `[[x * y] + [x * -z] + [y * z]]` is equivalent to the original DNF. And that is a general fact:

Theorem and Definition

For every propositional formula `φ` holds:
• There is exactly one PDNF `Δ` equivalent to `φ`. This is called the PDNF of `φ` and denoted by `pdnf(φ)`.
• There is exactly one PCNF `Γ` equivalent to `φ`. This is called the PCNF of `φ` and denoted by `pcnf(φ)`.

The `PropLogic` package provides two Haskell functions [HNota]

```  pdnf :: (Ord a) => PropForm a -> PDNF a
pcnf :: (Ord a) => PropForm a -> PCNF a```

that return these normal forms of given propositional formulas (based on a linearly ordered atom type `a`).

From the command line, we can call these two functions with

```  \$ ./PropLogic pdnf PROPFORM
\$ ./PropLogic pcnf PROPFORM```

respectively, where `PROPFORM` is a propositional formula in fancy notation and wrapped in double quotes. For example

```  \$ ./PropLogic pdnf "[[x * -z] + [y * z]]"
[[x * y] + [x * -z] + [y * z]]```
```  \$ ./PropLogic pcnf "[[x * -z] + [y * z]]"
[[x + y] * [x + z] * [y + -z]]```
```  \$ ./PropLogic pdnf "[x -> y -> x]"
[[-x * -y] + [x * y]]```

The formal definition of DNFs and CNFs necessarily implies nullary and unary junctions like `[*]` and `[+ x]`. But most users would prefer a simplified version, as we call it. For example, one would rather write

```  [[x * y] + -z + v]    instead of    [[x * -y] + [* -z] + [* v]]
[x * y * z]           instead of    [[+ x] * [+ y] * [+ z]]

etc. [SimpRules]. These abbreviated formulas are no DNFs and CNFs anymore in the strict formal sense. We call them simplified PDNFs and PCNFs or SPDNFs and SPCNFs, respectively. Accordingly, we have two command options

```  \$ ./PropLogic spdnf PROPFORM
\$ ./PropLogic spcnf PROPFORM```

to generate these slightly modified results. For example

```  \$ ./PropLogic pcnf "[[x * -z] + [y + z]]"
[* [x + y + z]]
\$ ./PropLogic spcnf "[[x * -z] + [y + z]]"
[x + y + z]```
```  \$ ./PropLogic pdnf "[[x * -z] + [y + z]]"
[[* x] + [* y] + [* z]]
\$ ./PropLogic spdnf "[[x * -z] + [y + z]]"
[x + y + z]```
```  \$ ./PropLogic pdnf "[[x ->  y] <-> [-x + y]]"
[+ [*]]
\$ ./PropLogic spdnf "[[x ->  y] <-> [-x + y]]"
true```
```  \$ ./PropLogic pcnf "[x <-> -x]"
[* [+]]
\$ ./PropLogic spcnf "[x <-> -x]"
false```

Most people would probably prefer to use these simplified versions in practice.

## How to solve problems with Prime Normal Forms?

Each of these functions `pdnf`, `pcnf`, `spdnf` and `spcnf` is a canonical normalizer or canonizer for the equivalence relation `<=>` on propositional formulas in the sense that (1.) every formula is equivalent to its canonization and (2.) two formulas are equivalent if and only if their canonizations are identical.

We can exploit this property of canonizers for the solution of the traditional problems of propositional logic, e.g. the question, if a formula is satisfiable. So in particular, each of the four canonizers is also a SAT-solver.

Theorem

Given two propositional formulas `φ` and `ψ`. Then
 1 `φ` is valid iff `spdnf(φ) = true` iff `spcnf(φ) = true` 2 `φ` is satisfiable iff `spdnf(φ) ≠ false` iff `spcnf(φ) ≠ false` 3 `φ => ψ` iff `spdnf([φ -> ψ]) = true` iff `spcnf([φ -> ψ]) = true` 4 `φ <=> ψ` iff `spdnf([φ <-> ψ]) = true` iff `spcnf([φ <-> ψ]) = true`

For example, `[x + -x]` is valid, because

```  \$ ./PropLogic spcnf "[x + -x]"
true```

## Intermediate result

This concludes our introduction to the `PropLogic` command as a tool for propositional logic.

Note, that you can always call for immediate help from the command itself by typing

`  \$ ./PropLogic help`

If you do so, you will find that `PropLogic` has some more options we didn't cover, yet. This involves just some more variations of the Prime Normal Form idea and we explain that in the next chapter.

The last chapter below then finally attempts to give an impression on the performance of the `PropLogic` command. Therefore, it also has an option `testing` that generates a series of randomly generated forms, generates their canonizations and reports some measurements for the whole process.

# Yet another variation of Prime Normal Forms

## X-Forms and I-Forms

There is yet another variation of the `pdnf` function, namely `xpdnf`. The "x" here simultaneously stands for "indexed" as well as "extended" [ThAlg].

Let us consider an example. First the PDNF

```  \$ ./PropLogic pdnf "[x -> y -> z]"
[[-x * -y] + [-x * z] + [y * z]]```

then the XPDNF of the same formula

```  \$ ./PropLogic xpdnf "[x -> y -> z]"
+------------------+
|      XPDNF       |
| +----+----+----+ |
| | x  | y  | z  | |
| +----+----+----+ |
| | -1 | -2 |    | |
| +----+----+----+ |
| | -1 |    | +3 | |
| +----+----+----+ |
| |    | +2 | +3 | |
| +----+----+----+ |
+------------------+```

It is easy to see that this table displays very much the same information than the PDNF. In the first row, there is the ordered lists of atoms: `x`, `y`, `z`. The other three rows are representations of the three NLCs of the PDNF, namely `[-1,-2]` stands for `[-x * -y]`, `[-1,3]` stands for `[-x * z]`, and `[2,3]` stands for `[y * z]`. The form is "indexed" in the sense that `1`, `2`, ... refer to the corresponding first, second, ... element in the ordered atom list, in this example case `[x,y,z]`.

The output of the `xpdnf` call is a two-dimensional table display. The underlying data structure in Haskell for these XPDNFs has the following components:

• The X-Form, here `([x,y,z],[[-1,-2],[-1,3],[2,3]]`, which is made of

1. The ordered atom list, here `[x,y.z]`

2. The I-Form, here `[[-1,-2],[-1,3],[2,3]]`, which is a list of three I-Lines `[-1,-2]`, `[-1,3]`, and `[2,3]` [COSTACK].

Each I-Line is a list `[i_1, i_2, ..., i_n]` of non-zero integers in an ordered fashion, namely `abs(i_1) < abs(i_2) < ... < abs(i_n)`.

• 3. An indication (i.e. a type constructor in Haskell), here "`XPDNF`", that tells how the X-Form needs to be translated into a propositional formula.

In the example, the X-Form `([x,y,z],[[-1,-2],[-1,3],[2,3]]` is supposed to be a `XPDNF`, so the X-Form translated into a formula is the DNF `[[-x * -y] + [-x * z] + [y * z]]`. If the X-Form would have been a `XPCNF`, then its conversion into a formula would have been the CNF `[[-x + -y] * [-x + z] * [y + z]]`.

The deeper reason for these X-Forms is the gain in speed. The "fast" canonizers that do the real work under the surface translate propositional formulas into X-Forms, and by far the most work is done on the I-Forms. In particular, it involves a lot of atom comparisons, and it is much "cheaper" to work on integers than comparing arbitrary atoms, e.g. strings.

Yet another advantage of the X- and I-Form abstraction is the fact that each function can be applied twice, for DNFs and CNFS, due to the dual character of propositional logic.

## Prime Normal Forms of I-Forms

An important example for the just mentioned exploitation of the dualism is the Haskell function

`  primForm :: IForm -> IForm`

that takes an IForm `q` and returns its Prime Normal Form `q'`. The DNF of `q'` is then the PDNF of the DNF of `q`, and the CNF of `q'` is the PCNF of the CNF of `q`.

This function can also be called from the command line with

`  \$ ./PropLogic primForm IFORM`

where `IFORM` is an I-Form, wrapped in double quote. For example,

```  \$ ./PropLogic primForm "[[-1,2,3],[1,2,3],[-1,2,-3],[-1,-2,3],[-1,-2,3],[-1,2,3]]"
[[-1,2],[-1,3],[2,3]]```

And this result can be read in two dual versions:

• `[[-1 * 2] + [-1 * 3] + [2 * 3]]` is the PDNF of the DNF `[[-1 * 2 * 3] + [1 * 2 * 3] + [-1 * 2 * -3] + [-1 * -2 * 3] + [-1 * -2 * 3] + [-1 * 2 * 3]]`

• `[[-1 + 2] * [-1 + 3] * [2 + 3]]` is the PCNF of the CNF `[[-1 + 2 + 3] * [1 + 2 + 3] * [-1 + 2 + -3] * [-1 + -2 + 3] * [-1 + -2 + 3] * [-1 + 2 + 3]]`

The `primForm` function will probably not be very interesting for the normal user. But it is the core function of all the other prime canonizers and it does the most work behind the scenes. It therefore is the candidate we are going to study for the performance of our whole Prime Normal Form approach to propositional logic.

# What is "fast"?

Generating Prime Normal Forms is a NP-hard problem [NP], we don't know if it can be done in polynomial time. Simply put, we don't know if there is a feasible algorithm for input of arbitrary size.

## Introduction

In this part we are going to present some empirical data on the computational costs of our `primForm` function. The actual `testing` function generates random input for `primForm` according to certain size parameters, and measures how long `primForm` takes to produce the results.

This concentration on just one function may need some justification. After all, the propositional algebras we implemented in the `PropLogic` package consist of about thirty functions. A full performance study should check them all. Besides, `primForm` works on I-Forms, but the average user would rather work with propositional formulas. However, note that

• most of the functions in the "fast" instances of propositional algebras are indeed of trivial or polynomial complexity. The harder ones, at least the theoretically harder ones, all use the prime form conversion, which does the essential and biggest part of the work.

• There is also a function to produce random propositional formulas. But it turns out, that the majority of these formulas are extreme, i.e. either tautologies or contradictions. This is not a flaw of the random generator, but a property of the data structure itself. The prime normal forms of extreme formulas are produced more or less instantanuously. But we are interested in the difficult cases, and random I-Forms are much better material input, as we will see. Recall, that I-Forms are just abstractions of DNFs and CNFs, and that converting a formula into either a DNF or CNF can be done in polynomial time [POLY]. The real work comes when DNFs and CNFs are converted into their prime normal forms.

• There is an infinite variety of NP-complete problems, but they are all related and the SAT solver or boolean satisfiability problem is a distinguished representative [SAT]. The `primForm` function is also a SAT solver and our data here is also designed to take part in the discussion about fast SAT solvers.

Recall, that this SAT solver works as follows: Given an I-Form `q` and let `q' := primForm(q)` be the prime normal form of `q`.

• If `q` is read as a DNF, then this DNF is

• satisfiable iff `q'` is not `[]`,

• valid iff `q'` is `[[]]`.

• If `q` is read as a CNF, then this CNF is

• satisfiable iff `q'` is not `[[]]`,

• valid iff `q'` is `[]`.

For example, the `primForm` call and result

```  \$ ./PropLogic primForm "[[-1,2,4],[1,4],[-3,4],,[-2,3]]"
[,,]```

tell us that the DNF

`  [[-1 * 2 * 4] + [1 * 4] + [-3 * 4] + [* 2] + [-2 * 3]]`

is satisfiable, but not valid, and that the CNF

`  [[-1 + 2 + 4] * [1 + 4] * [-3 + 4] * [+ 2] * [-2 + 3]]`

is also satisfiable and not valid [SatLog].

## Random forms and testing

We generate random I-Forms, apply `primForm` and then report the time that took. We do that many times, and thus obtain data for both the average and the worst case performance and we do all that with an average personal computer [Hardware].

### Size measures for I-Forms

Given an I-Form `Φ = [[i1,1,...,i1,n1],...,[im,1,...,im,nm]]` we define
• `atomNumber(Φ)`
is the atom number, i.e. the number `X` of the different `1,...,X` occurring either positive or negative
• `formLength(Φ) := m`
the I-Form length,
• `volume(Φ) := n1 + n2 + ... + nm`
the volume,
• `averageLineLength(Φ) := (n1+...+nm)/m`
the average I-Line length.

For example, the I-Form

`  [[-1,2,4],[1,4],[-3,4],,[-2,3]]`

has the atoms `1,2,3,4`, so the atom number is `4`. The I-Form length is `5`, the volume is `10` and the average I-Line length is `10/5=2`.

### The Random I-Form generator

The Haskell function for the random generation of I-Forms is [HAct]

`  randomIForm :: Int -> Int -> Int -> IO IForm`

i.e. we call it with three integer arguments

`  randomIForm X Y Z`

where `X` is the atom number, `Y` is the I-Form length, and `Z` is the average I-Line length.

For example, if we repeatedly call say

`  randomIForm 4 5 2`

we obtain I-Forms like

```  [[-1,-3,4],[2,3],[-4],[-1,-2],[1,-3]]
[[1,-3,-4],[-2,-4],[-1,2,3,-4],,[-4]]
[[2,4],[-1,-4],[1,-4],[-2],[-2,-3,-4]]
[[-2,-3],[1,-2],[-1,-2,3,4],[3,-4],[-1,2]]
etc.```

This random generator is not available from the command line directly. But it is part of the following `testing` function.

### The testing function

The `testing` function calls `randomIForm` a requested `N` number of times and has the syntax

`  PropLogic testing X Y Z N`

where `X` is the atom number, `Y` the I-Form length, `Z` is the average I-Line length and `N` is the number of test trials.

If we call e.g.

`  \$ ./PropLogic testing 4 5 2 1000`

we finally obtain something like this [Verbose]

```  +----------------------------------------------------------+
| All 1000 tests have been performed:                      |
|   Parameters for the random IForms were:                 |
|     atomNumber = 4                                       |
|     formLength = 5                                       |
|     averageLineLength = 2                                |
|   Result of the test series:                             |
|     numberOfTests = 1000                                 |
|     maxFormLength = 9                                    |
|     averageFormLength = 2                                |
|     maxTime = 4.001e-3 seconds                           |
|     averageTime = 1.6801500000000012e-4 seconds          |
|     standardTimeDeviation = 8.024281742156195e-4 seconds |
+----------------------------------------------------------+```

Obviously, this description first recalls the input parameters and then displays the the result of the test series, namely: the `numberOfTests` again, which was the input `1000`, and then the measurements for the worst (maximal) and the average results of the `primForm` conversion.

More precisely, if the `N` random forms were `p_1,...,p_N`, if `q_1,...q_N` are the according results, i.e. `q_i=primForm(p_i)`, and if this would have been obtained in `t_1,...,t_N` seconds, respectively, then the test result would show:

```  numberOfTests         = N
maxFormLength         = max { formLength(q_1), ..., formLength(q_N) }
averageFormLength     = (formLength(q_1) + ... + formLength(q_N)) / N
maxTime               = max { t_1, ..., t_N }
averageTime           = (t_1 + ... + t_N) / N
standardTimeDeviation = ((averageTime - t_1)^2 + ... + (averageTime - t_N)^2) / N```

### How to define Performance?

The computational complexity, i.e. the performance of an algorithm is usually expressed as a functional dependency of the time and space recources from the size of the input. The input of the `primForm` function is an I-Form. But we already have different notions to express the size of an I-Form: `atomNumber`, `formLength`, `averageLineLength` and `volume`. So which one is it?

Actually, one might argue that `volume` is the "real" size measure. But in fact, as we are going to demonstrate below, it is not the most suitable one for an analysis of the situtation we consider. From a theoretical point of view, the `atomNumber` is in fact the real parameter that expresses the complexity of the problem, and this is also the predominant one in the literature on complexity theory. However, given an atom number `X`, it does make a huge performance difference which length `Y` and average I-Line length `Z` we choose when we run `PropLogic testing X Y Z N`.

## Performance results

Let us now run our testing function by calling

`  \$ ./PropLogic testing X Y Z N`

where `X` is the `atomNumber`, `Y` is the `formLength` and `Z` is the `averageLineLength` of the `N` randomly generated I-Forms. But instead of simply accumulating data, let us establish some facts about the general performance behaviour of the `primForm` function.

### Note, that prime is not minimal

Calling

`  \$ ./PropLogic testing 5 10 4 1000`

gave us the final result

```  +-----------------------------------------------------------+
| All 1000 tests have been performed:                       |
|   Parameters for the random IForms were:                  |
|     atomNumber = 5                                        |
|     formLength = 10                                       |
|     averageLineLength = 4                                 |
|   Result of the test series:                              |
|     numberOfTests = 1000                                  |
|     maxFormLength = 17                                    |
|     averageFormLength = 8                                 |
|     maxTime = 2.0002e-2 seconds                           |
|     averageTime = 2.3721570000000006e-3 seconds           |
|     standardTimeDeviation = 2.6394240520141716e-3 seconds |
+-----------------------------------------------------------+```

Recall, that this means: we generated `1000` random I-Forms `p_1,...,p_1000` of length `10`, average I-Line length `4`, and thus a volume of approximately `10*4=40`. For each of those I-Forms, the Prime Normal Form `q_i=primForm(p_i)` was computed in `t_i` seconds. The maximal length of all these `p_1,...,p_1000` was `17`, the average length was `8`. The maximal time of the `t_1,...,t_1000` was about `20` milli-seconds, the average time was around `2` milli-seconds with an standard deviation of about 2 milli-seconds as well.

This example already reveals a general phenomenon: although the Prime Normal Forms `q_i` are most of the time shorter than the original I-Form `p_i`, in this case an average length of `8` compare to the length `10` of `p_i`, it may as well become longer, namely up to `17` in the given test trail.

In other words, Prime Normal Form and Minimal Normal Form are different, most of the time!

Actually, if `q = [l_1,...,l_k]` is a Prime Normal Form, an equivalent Minimal Normal Form `q' = [m_1,...,m_h]` is a selection of the I-Lines of `q`, i.e. `{m_1,...,m_h}` is a subset of `{l_1,...,l_k}`.

For example, `[[1,2],[1,3],[2,-3]]` is a Prime Normal Form and `[[1,3],[2,-3]]` is an equivalent Minimal Normal Form.

Well, if there are indeed Minimal Normal Forms, why are we not using them as standard canonizations, instead of the Prime Normal Forms? A full discussion and answer is given in [PNFCanon], but two crucial arguments are: 1. Minimal Normal Forms are not unique in general, there may be several equivalent ones, and 2. for the generation of Minimal Normal Forms it seems to be necessary to generate the Prime Normal Forms, so that their is neither a reduction of space, nor a reduction of time in the process.

### Increasing the length

If all other parameters stay the same, the increase of the `formLength` increases the costs considerably.

For example, if we keep

`  atomNumber = averageLineLength = 10`

then a variation of the `formLength` produced the following results:

```  formLength | maxFormLength  averageFormLength   maxTime averageTime
===========+=======================================================
10 |            10                  9     0.064       0.001
100 |            87                 77     0.248       0.148
500 |           452                403      91.9        56.1
1000 |           877                795      1659        1258```

Note in particular, that a length increase from `10` to `1000` results in a time increase from milli-seconds to almost half an hour.

Nevertheless, this increasing tendency is only local. From a certain point onwards, a further increase of the length results in smaller Prime Normal Forms. Recall, that for an atom number of `X`, there are only `2^X` different I-Lines of length `X`. The more I-Lines we add, the greater is the chance that the I-Forms collapses into the unit Prime Normal Form.

To demonstrate that, let us use an `atomNumber` of `4`, i.e. we run

`  \$ ./PropLogic testing 4 Y 4 10000`

for different `formLength=Y`. The following graph shows the decline of both the maximal and the average I-Form length of the Prime Normal Forms after a peak for around `formLength=14`. ### Increase the atom number and worst case analysis

A high `atomNumber` alone doesn't make problems unfeasible. For example, let us take

`  atomNumber = 1000`

Some examples for a relatively small `formLength` (with `averageLineLength` set to `500` and `N=100` tests) are:

```  formLength | maxFormLength averageFormLength  maxTime averageTime
===========+=====================================================
10 |            10                10    0.036       0.006
50 |            50                50    0.304       0.092
500 |           500               500    10.35         9.1
2000 |          2000              2000    308.7       304.3```

As we may suggest from the values of `maxFormLength` and `averageFormLength`, the input form is hardly changed, i.e. the random I-Forms are already in Prime Normal Form, but we don't care about a detailed explanation here. For an `atomNumber` of `1000` however, there are I-Forms with `<formLength ` `2^1000`> where all I-Lines are different and the Prime Normal Forms are not much smaller either [PrimNum], and that is beyond any limits we can hope for here.

The following test shall try to reveal the worst case behavior for a given `atomNumber=X`. We put

```  formLength        := 2^X
averageLineLength := X```

and slowly increase `X`. We obtain

```  atomNumber formLength averageLineLength numberOfTests  |  maxFormLength  maxTime
X        2^X                 X             N  |
=======================================================+=========================
2          4                 2           1000 |              2    0.004
3          8                 3           1000 |              6    0.004
4         16                 4           1000 |             13    0.008
5         32                 5           1000 |             24    0.056
6         64                 6           1000 |             46     0.28
7        128                 7           1000 |             92      2.2
8        256                 8           1000 |            193     18.0
9        512                 9            200 |            419      245
10       1024                10              5 |            833     2970```

### Conclusion

If we use an average equipment and if we accept a maximal waiting time up to say one minute, then the Prime Normal Form canonization is feasible for input forms with an atom number of 8.

### Additional Remark: Comparing the interpreted and compiled functions

The compiled function

`  PropLogic testing X Y Z N`

is the renamed version of the Haskell function `verboseRandomPrimeTesting`, which is called from within the `ghci` interpreter as

`  verboseRandomPrimeTesting (X,Y,Z) N`

Running some tests showed that the interpreted version is only about 15% slower.

[Inst]

http://www.bucephalus.org/text/InstallPropLogic/InstallPropLogic.html is a more detailed description of the installation and setup of the `PropLogic` package.

[HNota]

If you are not familiar with Haskell, the function signature such as

`  pdnf :: (Ord a) => PropForm a -> PDNF a`

probably looks awkward. But it is what more conventional mathematicians would rather write as

`  pdnf : PropForm(a) -> PDNF(a)`

i.e. the function `pdnf` has the propositional formulas on the parametric atom type `a` as its domain and the PDNFs on `a` as its codomain. In Haskell, the type operator "`:`" is written "`::`" (because "`:`" is reserved for list construction; "`x:l`" is the same as "`(cons x l)`" in LISP). The prefix `(Ord a) =>` means that this function is only defined on an atom type `a` that has a (linear) order structure defined on it. (This is an example of the type class concept, which is a pretty unique Haskell feature.)

[SimpRules]

The rules for the simplification are the following ones.

1. `[*] <=> true`
2. `[+] <=> false`
3. `[* φ] <=> φ`
4. `[+ φ] <=> φ`
5. `[φ1 * ... * φm * true * ψ1 * ... * ψn] <=> [φ1 * ... * φm * ψ1 * ... * ψn]`
6. `[φ1 + ... + φm + true + ψ1 + ... + ψn] <=> true`
7. `[φ1 * ... * φm * false * ψ1 * ... * ψn] <=> false`
8. `[φ1 + ... + φm + false + ψ1 + ... + ψn] <=> [φ1 + ... + φm + ψ1 + ... + ψn]`
[COSTACK]

In the Haskell implementation, the built-in Haskell lists are replaced by so-called costacks, i.e. I-Lines are costacks of integers and I-Forms are costacks of I-Lines.

Functional programming languages with LISP as a common ancestor implement lists as stacks, i.e. lists can only be modified by adding or removing a head component. Therefore, the concatenation of two lists is not a primitive operation, one first needs to walk throught the entire first list to then attach the second one. A costack or concatenable stack therefore is a list-like data structure, where concatenation is indeed a primitive operation, too.

The "fast" normalizers in the `PropLogic` package do their main work on I-Lines and I-Forms, and they do apply concatenations frequently. Therefore, the costack data structure is implemented in a separate `Costack` module. (In fact however, the real implementation of costacks in the `Costack` module is currently still done with ordinary Haskell lists. Different, more effective implementations are intended for future releases of the `PropLogic` package.)

[ThAlg]

In this introduction, we only concentrate on the "indexed" interpretation of the "X", i.e. how the atoms are replaced by positive integers to make the data structures more efficient. We neglect the "extended" meaning of the "X". On a deeper level, this very much has to do with our theory-algebraic reconstruction of propositional logic in favor of the traditional algebraization as a boolean algebra (see the other literature on http://www.bucephalus.org/PropLogic).

But also in the use of the command line, it shows us that the XPDNF is not just a another representation of the PDNF, but carries a little more information. Consider the formula `[x -> y -> [z + true]]`. The atom `z` is what we call redundant in the formula in the sense that there are equivalent formulas without `z`. Prime Normal Forms only contain irredundant atoms and we see that `z` is disappeared in the PDNF

```  \$ ./PropLogic pdnf "[x -> y -> [z + true]]"
[[* -x] + [* y]]```

But the XPDNF preserves all the atoms, it is equiatomic to the original formula

```  \$ ./PropLogic xpdnf "[x -> y -> [z + true]]"
+-----------------+
|      XPDNF      |
| +----+----+---+ |
| | x  | y  | z | |
| +----+----+---+ |
| | -1 |    |   | |
| +----+----+---+ |
| |    | +2 |   | |
| +----+----+---+ |
+-----------------+```

and it is not just

```  +-------------+
|    XPDNF    |
| +----+----+ |
| | x  | y  | |
| +----+----+ |
| | -1 |    | |
| +----+----+ |
| |    | +2 | |
| +----+----+ |
+-------------+```
[NP]

see e.g. http://en.wikipedia.org/P_versus_NP_problem for an introduction into the discussion in complexity theory.

[POLY]

Actuall, that the DNF and CNF conversion of propositional formulas can be done in polynomial time, is a questionable statement. At least for our special versions of propositional formulas, that also allow sub- and equijunctions. But this is not the main focus in complexity theory and we ignore the possible objections here. Besides, in the "fast" instances of propositional algebras in the `PropLogic` package are all based on data structures, where propositions are represented as DNFs or CNFs of some sort.

[SAT]

There are online discussions devoted to the SAT problem, e.g. http://www.satlive.org and http://www.satisfiability.org.

[SatLog]

In fact, the test for the satisfiability of a DNF (and the test for the validity of the CNF) are trivial: Due to the semi-ordered definition of our DNFs, every DNF other than `[+]` is satisfiable anyway, because if it has at least one NLC `[L1 * ... * Lm]`, then this NLC is satisfiable (because the `L1,...,Lm` cannot contain a complementary pair of literals), and so is the disjunction of multiple NLCs. So in fact, the satisfiability test is really only significant for CNF interpretations of I-Forms. But that doesn't change the general argument.

[Hardware]

Intel Pentium Dual CPU, 1.60 GHz; 2GB memory; Ubuntu Linux 8.10; Glasgow Haskell Compiler, Version 6.10.4.

[Verbose]

A call of the `testing` function produces very verbose output, for each of the `N` test cases it reports the full situation. This is done because when the random formulas become bigger, each `primForm` call becomes harder to perform and running test series may take long periods of time where nothing happens and a verbose version gives a better feeling for the behavior. On the other hand, this output-production adds to the time which is measured. But we neglect this artificial brake on the speed.

[HAct]

`  randomIForm :: Int -> Int -> Int -> IO IForm`

shows two more typical features of common Haskell functions: the three arguments are not given as a triple, the domain of the function is not a ternary cartesian product. The function is "currified" into a higher-order function of third degree.

And the type `IO IForm` is the type of "actions" on I-Forms, encapsulated in `IO` to preserve the purely functional character of the language. Otherwise, `randomIForm` wouldn't be a proper function, because the result is not deterministic; typical for a random generator. (The theoretical framework for all this is the famous "monad" concept of Haskell.)

[PNFCanon]

http://www.bucephalus.org/text/PNFCanon/PNFCanon.html The 36 pages paper called Theory and implementation of efficient canonical systems for sentential calculus, based on Prime Normal Forms is the main source for the mathematical background of the approach and algorithms in the `PropLogic` package.

[PrimNum]

There are studies on the length of Prime Normal Forms, such as: A. K. Chandra, George Markowsky: On the Number of Prime Implicants, Discrete Mathematics 24 (1978), p.7-11, (http://laptops.maine.edu/NumPrimeImplicants.pdf).