API Module

GriSPy has the following queries implemented:
- bubble_neighbors: find neighbors within a given radius. A different radius for each centre can be provided.
- shell_neighbors: find neighbors within given lower and upper radius. Different lower and upper radius can be provided for each centre.
- nearest_neighbors: find the nth nearest neighbors for each centre.
And the following methods are available:
- set_periodicity: define the periodicity conditions.

Module grispy.core

GriSPy core class.

class grispy.core.BuildStats(buildtime, periodicity_set_at, datetime)[source]

Bases: object

Statistics about the grid creation.

buildtime

The number of seconds expended in build the grid.

Type:float
periodicity_set_at

The date and time when the periodicity was setted.

Type:datetime.datetime
datetime

The date and time of build.

Type:datetime.datetime
class grispy.core.PeriodicityConf(periodic_flag, pd_hi, pd_low, periodic_edges, periodic_direc)[source]

Bases: object

Internal representation of the periodicity of the Grid.

class grispy.core.GriSPy(data=None, N_cells=64, periodic=NOTHING, metric='euclid', copy_data=False)[source]

Bases: object

Grid Search in Python.

GriSPy is a regular grid search algorithm for quick nearest-neighbor lookup.

This class indexes a set of k-dimensional points in a regular grid providing a fast aproach for nearest neighbors queries. Optional periodic boundary conditions can be provided for each axis individually.

The algorithm has the following queries implemented: - bubble_neighbors: find neighbors within a given radius. A different radius for each centre can be provided. - shell_neighbors: find neighbors within given lower and upper radius. Different lower and upper radius can be provided for each centre. - nearest_neighbors: find the nth nearest neighbors for each centre.

Other methods: - set_periodicity: set periodicity condition after the grid was built.

To be implemented: - box_neighbors: find neighbors within a k-dimensional squared box of a given size and orientation. - n_jobs: number of cores for parallel computation.

Parameters:
  • data (ndarray, shape(n,k)) – The n data points of dimension k to be indexed. This array is not copied, and so modifying this data may result in erroneous results. The data can be copied if the grid is built with copy_data=True.
  • N_cells (positive int, optional) – The number of cells of each dimension to build the grid. The final grid will have N_cells**k number of cells. Default: 64
  • copy_data (bool, optional) – Flag to indicate if the data should be copied in memory. Default: False
  • periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that correspond to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Important: The periodicity only works within one periodic range. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
  • metric (str, optional) – Metric definition to compute distances. Options: ‘euclid’, ‘haversine’ ‘vincenty’ or a custom callable.
dim

The dimension of a single data-point.

Type:int
grid_

This dictionary contains the data indexed in a grid. The key is a tuple with the k-dimensional index of each grid cell. Empty cells do not have a key. The value is a list of data points indices which are located within the given cell.

Type:dict
k_bins_

The limits of the grid cells in each dimension.

Type:ndarray, shape (N_cells+1,k)
periodic_flag_

If any dimension has periodicity.

Type:bool
periodic_conf_

Statistics and intermediate results to make easy and fast the searchs with periodicity.

Type:grispy.core.PeriodicityConf
time_

Object containing the building time and the date of build.

Type:grispy.core.BuildStats
periodic_flag_

Proxy to periodic_conf_.periodic_flag.

set_periodicity(periodic, inplace=False)[source]

Set periodicity conditions.

This allows to define or change the periodicity limits without having to construct the grid again.

Important: The periodicity only works within one periodic range.

Parameters:
  • periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that corresponds to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
  • inplace (boolean, optional (default=False)) – If its True, set the periodicity on the current GriSPy instance and return None. Otherwise a new instance is created and returned.
bubble_neighbors(centres, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]

Find all points within given distances of each centre.

Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • distance_upper_bound (scalar or ndarray of length m) – The radius of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
  • kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

shell_neighbors(centres, distance_lower_bound=-1.0, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]

Find all points within given lower and upper distances of each centre.

Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • distance_lower_bound (scalar or ndarray of length m) – The minimum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • distance_upper_bound (scalar or ndarray of length m) – The maximum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
  • kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

nearest_neighbors(centres, n=1, kind='quicksort')[source]

Find the n nearest-neighbors for each centre.

Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • n (int, optional) – The number of neighbors to fetch for each centre. Default: 1.
  • kind (str, optional) – The returned neighbors will be ordered by increasing distance to the centre. The sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

Module grispy.distances

Distances implementations for GriSPy.

grispy.distances.euclid(c0, centres, dim)[source]

Classic Euclidean distance.

In mathematics, the Euclidean distance or Euclidean metric is the “ordinary” straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance. More info: https://en.wikipedia.org/wiki/Euclidean_distance

grispy.distances.haversine(c0, centres, dim)[source]

Distance using the Haversine formulae.

The haversine formula determines the great-circle distance between two points on a sphere given their longitudes and latitudes. Important in navigation, it is a special case of a more general formula in spherical trigonometry, the law of haversines, that relates the sides and angles of spherical triangles. More info: https://en.wikipedia.org/wiki/Haversine_formula

grispy.distances.vincenty(c0, centres, dim)[source]

Calculate distance on the surface of a spheroid with Vincenty Formulae.

Vincenty’s formulae are two related iterative methods used in geodesy to calculate the distance between two points on the surface of a spheroid, developed by Thaddeus Vincenty (1975a). They are based on the assumption that the figure of the Earth is an oblate spheroid, and hence are more accurate than methods that assume a spherical Earth, such as great-circle distance. More info: https://en.wikipedia.org/wiki/Vincenty%27s_formulae

Module grispy.validators

Functions to validate GriSPy input parameters.

grispy.validators.validate_centres(centres, data)[source]

Validate method params: centres.

grispy.validators.validate_equalsize(a, b)[source]

Check if two arrays have the same lenght.

grispy.validators.validate_distance_bound(distance, periodic)[source]

Distance bounds, upper and lower, can be scalar or numpy array.

grispy.validators.validate_shell_distances(lower_bound, upper_bound, periodic)[source]

Distance bounds, upper and lower, can be scalar or numpy array.

grispy.validators.validate_bool(flag)[source]

Check if bool.

grispy.validators.validate_sortkind(kind)[source]

Define valid sorting algorithm names.

grispy.validators.validate_n_nearest(n, data, periodic)[source]

Validate method params: n_nearest.