# Data-type implementation for multiple dynamic histograms

Posted on

Problem

This is almost exercise 3.2.14. from the book Computer Science An Interdisciplinary Approach by Sedgewick & Wayne (since I am self-studying, I changed it a little bit):

Develop a version of Histogram that uses StdDraw, so that a client can
create multiple histograms. Use a test client that creates histograms
for flipping coins (Bernoulli trials) with a biased coin that is heads
with probability p, for p = 0.2, 0.4, 0.6. and 0.8, taking the number
of trials from the command line.

Here are my programs:

``````public class Histogram {
private final double[] data;
private final double max;
public Histogram(double[] data, double max) {
this.data = data;
this.max = max;
StdDraw.setXscale(0, data.length);
StdDraw.setYscale(0, max * 3);
}
public double[] getData() {
return data;
}
public int findMax() {
double max = 0;
int dataLength = data.length;
for (int i = 0; i < dataLength; i++) {
max = Math.max(max, data[i]);
}
return (int) max;
}
data[index]++;
}
public void draw(double xIncrement, double yIncrement) {
StdDraw.enableDoubleBuffering();
StdDraw.setPenColor(StdDraw.BOOK_BLUE);
for (int i = 0; i < data.length; i++) {
StdDraw.filledRectangle(i + 0.5 + xIncrement * data.length, yIncrement * data.length + data[i] / 2, 0.45, data[i] / 2);
StdDraw.show();
}
StdDraw.setPenColor(StdDraw.RED);
StdDraw.line(data.length + xIncrement * data.length + 0.005, 0,
data.length + xIncrement * data.length + 0.025, max * 3);
}
public static void main(String[] args) {
int trials = Integer.parseInt(args);
double[] diceData = new double;
Histogram histogram = new Histogram(diceData, (trials / 6) * 2);
StdDraw.setPenColor(StdDraw.BOOK_BLUE);
for (int t = 1; t <= trials; t++) {
double r = Math.random();
if (r < 1.0 / 6.0) histogram.addData(0);
else if (r < 2.0 / 6.0) histogram.addData(1);
else if (r < 3.0 / 6.0) histogram.addData(2);
else if (r < 4.0 / 6.0) histogram.addData(3);
else if (r < 5.0 / 6.0) histogram.addData(4);
else if (r < 6.0 / 6.0) histogram.addData(5);
histogram.draw(0, 0);
}
}
}
``````

``````public class Histograms {
private final Histogram[] histograms;
private final double max;
public Histograms(Histogram[] histograms, double max) {
this.histograms = histograms;
this.max = max;
StdDraw.setXscale(0, histograms.getData().length * histograms.length);
StdDraw.setYscale(0, max);
}
public void draw() {
int rows = histograms.length;
int columns = histograms.length;
for (int i = 0; i < columns; i++) {
if (rows % columns == 0) {
rows = rows / columns;
break;
} else {
rows++;
}
}
int m = 0;
for (int c = 0; c < columns; c++) {
for (int r = 0; r < rows; r++) {
histograms[m].draw(c, r);
m++;
}
}
}
public static void main(String[] args) {
int trials = Integer.parseInt(args);
double max = trials;
double[] probabilities = {
0.2,
0.4,
0.6,
0.8
};
double[][] diceData = new double;
Histogram[] histograms = new Histogram;
for (int i = 0; i < 4; i++) {
histograms[i] = new Histogram(diceData[i], max);
}
for (int t = 1; t <= trials; t++) {
Histograms multipleHistograms = new Histograms(histograms, max);
multipleHistograms.draw();
StdDraw.pause(20);
}
}
}
``````

I wanted `Histogram` to work also independently of `Histograms` and so I was forced to inject some redundancy into these two programs (for example they both use scaling but scaling within `Histograms` overrides the scaling within `Histogram`).

Here is one instance of `Histogram`:

Input: 100

Output: Here is one instance of `Histograms`:

Input: 100

Output: Is there any way that I can improve my programs?

Solution

A few remarks, mainly focussing on the `Histogram` class.

## Separation of concerns

Your application contains different parts: algorithmic parts (collecting histogram data), user interface parts (`draw()` methods), and `main()` methods.

You already separate these concerns into different methods, which is a good thing. But I recommend to go one step further, and to have different classes, e.g.:

• Histogram (for the algorithmic part),
• StdDrawHistogram (for presenting a histogram using `StdDraw`),
• HistogramApp (containing the `main()` method, wiring the parts together).

## Histogram class API

I’d change the public API of Histogram a bit:

You’re using a `double[]` array for counting. But you never do fractional counts, you always just add an integer `1` to the buckets. So, you should change that to `int[]` (or `long[]` in the unlikely case you expect more than two billion counts).

Your constructor `public Histogram(double[] data, double max)` forces the caller to prepare a `double[]` array of the appropriate dimension. This comes as a surprise to the caller. It should be enough to tell the `Histogram` the number of buckets, and setting up the counting structure (your array) should be the responsibility of the Histogram constructor.

## Encapsulation

You typically want to hide the internals (that you’re using an array instead of some other fancy data structure) from your callers. This way you’re free to change the internals later without impact on code using your class. And you make it impossible for external code to fiddle around with the internals, bypassing the API of your class. E.g. currently, somewhere in your `main()` method you can write things like `diceData = 7;`, and then the histogram data on that bucket gets overwritten. The only way of changing the data of your histogram should be by calling its methods.

Following this argument, I’d also replace the `public double[] getData()` method with two methods:

``````public int getNumberOfBuckets() {
return data.length;
}

public int getCountInBucket(int i) {
return data[i];
}
``````

You also force the constructor’s caller to provide a `max` value that can’t be known at that time, while the histogram is still empty. Maybe the name is misleading, and you don’t mean the maximum count in a bucket, but the drawing height. Then it doesn’t belong in the algorithmic part, but into the drawing part.

With the example of this `max` field, I strongly recommend to document the meaning of your code elements. At least to me, it’s unclear whether max is intended to be the maximum count found in the histogram, or a space to be reserved when drawing the histogram, or something else. Writing Javadoc comments for fields and methods helps you avoid such ambiguities. Especially if you read your code some months after the initial creation, you’ll be glad you documented it. Having to express your concepts in writing forces you to have the concepts clearly settled in your mind.
Another means of helping readability is the naming of code elements. You already did quite a good job there, only the name `max` is a little unclear, as discussed earlier.
You have a field named `max`. It’s only used inside the constructor, to set the scale for drawing, and not accessed anywhere else. So, holding it as information in the instance for all the lifetime of the instance after construction is useless, and you can eliminate the `private final double max;` field.
In the useful `findMax()` method, you have a local variable `max` collecting the algorithmic information “highest count in any bucket”. Having the same name as the field `max` confuses the reader. I first didn’t notice that these were two different things. I’d recommend not to re-use field names as local variables.