- the Distances vignette now has a fixed documentation for the benchmarking of low-level distance functions. Many thanks to (@Nowosad) #30
- in
`../src/correlation.h`

adjustment of use of logical operators rather than Wbitwise (`| -> or`

) which otherwises raises warnings in`clang14`

- vector element limit is now extended to long vectors for all
distance measures by declaring
`R_xlen_t`

instead of`int`

during indexing.

`distance()`

and all other individual information theory functions receive a new argument`epsilon`

with default value`epsilon = 0.00001`

to treat cases where in individual distance or similarity computations yield`x / 0`

or`0 / 0`

. Instead of a hard coded epsilon, users can now set`epsilon`

according to their input vectors. (Many thanks to Joshua McNeill #26 for this great question).- three new functions
`dist_one_one()`

,`dist_one_many()`

,`dist_many_many()`

are added. They are fairly flexible intermediaries between`distance()`

and single distance functions.`dist_one_one()`

expects two vectors (probability density functions) and returns a single value.`dist_one_many()`

expects one vector (a probability density function) and one matrix (a set of probability density functions), and returns a vector of values.`dist_many_many()`

expects two matrices (two sets of probability density functions), and returns a matrix of values. (Many thanks to Jakub Nowosad, see #27, #28, and New Vignette Many_Distance)

- a new Vignette Comparing many probability density functions (Many thanks to Jakub Nowosad)
`dplyr`

package dependency was removed and replaced by the`poorman`

due to the heavy dependency burden of`dplyr`

, since`philentropy`

only used`dplyr::between()`

which is now`poorman::between()`

(Many thanks to Patrice Kiener for this suggestion)`distance(..., as.dist.obj = TRUE)`

now returns the same values as`stats::dist()`

when working with 2 dimensional input matrices (2 vector inputs) (see #29) (Many thanks to Jakub Nowosad (@Nowosad)) Example:

```
library(philentropy)
= matrix(c(1, 2), ncol = 1)
m1
dist(m1)
#> 1
#> 2 1
distance(m1, as.dist.obj = TRUE)
#> Metric: 'euclidean'; comparing: 2 vectors.
#> 1
#> 2 1
```

- the
`distance()`

function receives a new argument`mute.message`

allowing users to mute message printing when running large-scale distance computations. Example:

```
distance(rbind(1:10/sum(1:10), 20:29/sum(20:29)),
method = "euclidean",
mute.message = TRUE)
```

- adding
`markdown`

dependency to`DESCRIPTION`

(find details here)

the

`distance()`

function receives a new argument`use.row.names`

to enable passing the row names from the input probability or count matrix to the output distance matrixthe

`distance()`

function can now handle`data.table`

and`tibble`

input #16adding new functionality and arguments

`as.dist.obj`

,`diag`

, and`upper`

to`philentropy::distance()`

to allow users to retrieve a`stats::dist()`

object when working with`philentropy::distance()`

(Many thanks to Hugo Tavares #18 - see also #13) When using`philentropy::distance(..., as.dist.obj = TRUE)`

users can now directly pass the`distance()`

output into`hclust`

:

Before:

```
<- rbind(1:10/sum(1:10), 20:29/sum(20:29),30:39/sum(30:39))
ProbMatrix <- distance(ProbMatrix, method = "jaccard")
dist.mat <- as.dist(dist.mat)
true.dist.mat <- hclust(true.dist.mat, method = "complete")
clust.res clust.res
```

```
Call:
hclust(d = true.dist.mat, method = "complete")
Cluster method : complete
Number of objects: 3
```

Now:

```
<- rbind(1:10/sum(1:10), 20:29/sum(20:29),30:39/sum(30:39))
ProbMatrix <- distance(ProbMatrix, method = "jaccard", as.dist.obj = TRUE)
dist.mat <- hclust(true.dist.mat, method = "complete")
clust.res clust.res
```

```
Call:
hclust(d = true.dist.mat, method = "complete")
Cluster method : complete
Number of objects: 3
```

- fixing a bug in
`gJSD()`

which tested transposed matrix rows rather than transposed matrix columns for sum > 1 (see issue #17 ; many thanks to @wkc1986)

- exporting all Rcpp distance measure functions individually (see issue #9), this enables access to much faster computations (see micro benchmarks at https://hajkd.github.io/philentropy/articles/Distances.html)

fixing bug which caused that KL distance returns NaN when P == 0 (see issue #10; Many thanks to @KaiserDominici)

fixing bug which caused stack overflow when computing distance matrices with many rows (see issue #7; Many thanks to @wkc1986 and @elbamos)

fixing bug in

`gJSD()`

where an`rbind()`

input matrix is not properly transposed (Many thanks to @vrodriguezf; see issue #14)

`gJSD()`

receives new argument`est.prob`

to enable empirical estimation of probability vectors from input count vectors (non-probabilistic vectors)Jaccard and Tanimoto similarity measures now return

`0`

instead of`NAN`

when probability vectors contain zeros (Many thanks to @JonasMandel; see issue #15)

- Fixing bug that caused
`jensen-shannon`

computations to compute wrong values when`0 values`

were present in the input vectors (see issue #4 ; Many thanks to @wkc1986) - Fixing bug that caused
`jensen-difference`

computations to compute wrong values when`0 values`

were present in the input vectors - Fixing bugs in all distance metrics when handing 0/0, 0/x or x/0 cases

- new message system
- extending documentation

- Fixing bug that caused that
`JSD()`

gives NaN when any probability is 0 - see https://github.com/HajkD/philentropy/issues/1 (Thanks to William Kurtis Chang)

- Fixing C++ memory leaks in
`dist.diversity()`

and`distance()`

when check for`colSums(x) > 1.001`

was peformed (leak was found with`rhub::check_with_valgrind()`

)

Initial submission version.