API Module

GriSPy has the following queries implemented:
- bubble_neighbors: find neighbors within a given radius. A different radius for each centre can be provided.
- shell_neighbors: find neighbors within given lower and upper radius. Different lower and upper radius can be provided for each centre.
- nearest_neighbors: find the nth nearest neighbors for each centre.
And the following methods are available:
- set_periodicity: define the periodicity conditions.

Module grispy.core

GriSPy core class.

class grispy.core.PeriodicityConf(periodic_flag, pd_hi, pd_low, periodic_edges, periodic_direc)[source]

Bases: object

Internal representation of the periodicity of the Grid.

class grispy.core.Grid(data=None, N_cells=64, copy_data=False)[source]

Bases: object

Grid indexing.

Grid is a regular grid indexing algorithm. This class indexes a set of k-dimensional points in a regular grid.

Parameters:
  • data (ndarray, shape(n,k)) – The n data points of dimension k to be indexed. This array is not copied, and so modifying this data may result in erroneous results. The data can be copied if the grid is built with copy_data=True.
  • N_cells (positive int, optional) – The number of cells of each dimension to build the grid. The final grid will have N_cells**k number of cells. Default: 64
  • copy_data (bool, optional) – Flag to indicate if the data should be copied in memory. Default: False
dim

The dimension of a single data-point.

Type:int
grid

This dictionary contains the data indexed in a grid. The key is a tuple with the k-dimensional index of each grid cell. Empty cells do not have a key. The value is a list of data points indices which are located within the given cell.

Type:dict
k_bins

The limits of the grid cells in each dimension.

Type:ndarray, shape (N_cells + 1, k)
edges

Grid edges or bound values. The lower and upper bounds per dimension.

Type:ndarray, shape (2, k)
epsilon

Value of increment used to create the grid edges.

Type:float
ndata

Total number of a data-points.

Type:int
shape

Number of cells per dimension.

Type:tuple
size

Total number of cells.

Type:int
cell_width

Cell size in each dimension.

Type:ndarray
dim

Grid dimension.

edges

Edges of the grid in each dimension.

epsilon

Epsilon used to expand the grid.

ndata

Total number of a data-points.

shape

Grid shape, i.e. number of cells per dimension.

size

Grid size, i.e. total number of cells.

cell_width

Cell size in each dimension.

contains(points)[source]

Check if points are contained within the grid.

Parameters:points (ndarray, shape (m,k)) – The point or points to check against the grid domain.
Returns:bool – Boolean array indicating if a point is contained within the grid.
Return type:ndarray, shape (m,)
cell_digits(points)[source]

Return grid cell indices for a given point.

Parameters:points (ndarray, shape (m,k)) – The point or points to calculate the cell indices.
Returns:digits – Array of cell indices with same shape as points. If a point is outside of the grid edges -1 is returned.
Return type:ndarray, shape (m,k)
cell_id(points)[source]

Return grid cell unique id for a given point.

Parameters:points (ndarray, shape (m,k)) – The point or points to calculate the cell unique id.
Returns:ids – Array of cell unique ids for each point. If a point is outside of the grid edges -1 is returned.
Return type:ndarray, shape (m,)
cell_digits2id(digits)[source]

Return unique id of cells given their digits.

Parameters:digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers.
Returns:ids – Array of cell unique ids for each point.
Return type:ndarray, shape (m,)
cell_id2digits(ids)[source]

Return cell digits given their unique id.

Parameters:ids (ndarray, shape (m,)) – Array of cell unique ids for each point.
Returns:digits – Array of cell indices.
Return type:ndarray, shape (m,k)
cell_walls(digits)[source]

Return cell wall coordinates for given cell digits.

Parameters:digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers.
Returns:
  • lower (ndarray, shape (m, 3)) – Lower cell wall values for each point.
  • upper (ndarray, shape (m, 3)) – Upper cell wall values for each point.
cell_centre(digits)[source]

Return cell centre coordinates for given cell digits.

Parameters:digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers.
Returns:centres – Cell centre for each point.
Return type:ndarray, shape (m, k)
cell_count(digits)[source]

Return number of points within given cell digits.

Parameters:digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers.
Returns:count – Cell count for each for each cell.
Return type:ndarray, shape (m,)
cell_points(digits)[source]

Return indices of points within given cell digits.

Parameters:digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers.
Returns:points – List of m arrays. Each array has the indices to the neighbors of that cell.
Return type:list, length m
class grispy.core.GriSPy(data=None, N_cells=64, copy_data=False, periodic=NOTHING, metric='euclid')[source]

Bases: grispy.core.Grid

Grid Search in Python.

GriSPy is a regular grid search algorithm for quick nearest-neighbor lookup.

This class indexes a set of k-dimensional points in a regular grid providing a fast aproach for nearest neighbors queries. Optional periodic boundary conditions can be provided for each axis individually.

The algorithm has the following queries implemented: - bubble_neighbors: find neighbors within a given radius. A different radius for each centre can be provided. - shell_neighbors: find neighbors within given lower and upper radius. Different lower and upper radius can be provided for each centre. - nearest_neighbors: find the nth nearest neighbors for each centre.

Other methods: - set_periodicity: set periodicity condition after the grid was built.

To be implemented: - box_neighbors: find neighbors within a k-dimensional squared box of a given size and orientation. - n_jobs: number of cores for parallel computation.

Parameters:
  • data (ndarray, shape(n,k)) – The n data points of dimension k to be indexed. This array is not copied, and so modifying this data may result in erroneous results. The data can be copied if the grid is built with copy_data=True.
  • N_cells (positive int, optional) – The number of cells of each dimension to build the grid. The final grid will have N_cells**k number of cells. Default: 64
  • copy_data (bool, optional) – Flag to indicate if the data should be copied in memory. Default: False
  • periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that correspond to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Important: The periodicity only works within one periodic range. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
  • metric (str, optional) – Metric definition to compute distances. Options: ‘euclid’, ‘haversine’ ‘vincenty’ or a custom callable.
dim

The dimension of a single data-point.

Type:int
grid

This dictionary contains the data indexed in a grid. The key is a tuple with the k-dimensional index of each grid cell. Empty cells do not have a key. The value is a list of data points indices which are located within the given cell.

Type:dict
k_bins

The limits of the grid cells in each dimension.

Type:ndarray, shape (N_cells + 1, k)
edges

Grid edges or bound values. The lower and upper bounds per dimension.

Type:ndarray, shape (2, k)
epsilon

Value of increment used to create the grid edges.

Type:float
ndata

Total number of a data-points.

Type:int
shape

Number of cells per dimension.

Type:tuple
size

Total number of cells.

Type:int
cell_width

Cell size in each dimension.

Type:ndarray
periodic_flag

If any dimension has periodicity.

Type:bool
periodic_conf

Statistics and intermediate results to make easy and fast the searchs with periodicity.

Type:grispy.core.PeriodicityConf
periodic_flag

Proxy to periodic_conf_.periodic_flag.

set_periodicity(periodic, inplace=False)[source]

Set periodicity conditions.

This allows to define or change the periodicity limits without having to construct the grid again.

Important: The periodicity only works within one periodic range.

Parameters:
  • periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that corresponds to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
  • inplace (boolean, optional (default=False)) – If its True, set the periodicity on the current GriSPy instance and return None. Otherwise a new instance is created and returned.
bubble_neighbors(centres, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]

Find all points within given distances of each centre.

Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • distance_upper_bound (scalar or ndarray of length m) – The radius of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
  • kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

shell_neighbors(centres, distance_lower_bound=-1.0, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]

Find all points within given lower and upper distances of each centre.

The distance condition is:
distance_lower_bound <= distance < distance_upper_bound
Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • distance_lower_bound (scalar or ndarray of length m) – The minimum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • distance_upper_bound (scalar or ndarray of length m) – The maximum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
  • sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
  • kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

nearest_neighbors(centres, n=1, kind='quicksort')[source]

Find the n nearest-neighbors for each centre.

Parameters:
  • centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
  • n (int, optional) – The number of neighbors to fetch for each centre. Default: 1.
  • kind (str, optional) – The returned neighbors will be ordered by increasing distance to the centre. The sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
  • njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns:

  • distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
  • indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.

Module grispy.distances

Distances implementations for GriSPy.

grispy.distances.euclid(c0, centres, dim)[source]

Classic Euclidean distance.

In mathematics, the Euclidean distance or Euclidean metric is the “ordinary” straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance. More info: https://en.wikipedia.org/wiki/Euclidean_distance

grispy.distances.haversine(c0, centres, dim)[source]

Distance using the Haversine formulae.

The haversine formula determines the great-circle distance between two points on a sphere given their longitudes and latitudes. Important in navigation, it is a special case of a more general formula in spherical trigonometry, the law of haversines, that relates the sides and angles of spherical triangles. More info: https://en.wikipedia.org/wiki/Haversine_formula

grispy.distances.vincenty(c0, centres, dim)[source]

Calculate distance on the surface of a spheroid with Vincenty Formulae.

Vincenty’s formulae are two related iterative methods used in geodesy to calculate the distance between two points on the surface of a spheroid, developed by Thaddeus Vincenty (1975a). They are based on the assumption that the figure of the Earth is an oblate spheroid, and hence are more accurate than methods that assume a spherical Earth, such as great-circle distance. More info: https://en.wikipedia.org/wiki/Vincenty%27s_formulae

Module grispy.validators

Functions to validate GriSPy input parameters.

grispy.validators.validate_digits(digits, N_cells)[source]

Validate method params: digits.

grispy.validators.validate_ids(ids, size)[source]

Validate method params: ids.

grispy.validators.validate_centres(centres, data)[source]

Validate method params: centres.

grispy.validators.validate_equalsize(a, b)[source]

Check if two arrays have the same lenght.

grispy.validators.validate_distance_bound(distance, periodic)[source]

Distance bounds, upper and lower, can be scalar or numpy array.

grispy.validators.validate_shell_distances(lower_bound, upper_bound, periodic)[source]

Distance bounds, upper and lower, can be scalar or numpy array.

grispy.validators.validate_bool(flag)[source]

Check if bool.

grispy.validators.validate_sortkind(kind)[source]

Define valid sorting algorithm names.

grispy.validators.validate_n_nearest(n, data, periodic)[source]

Validate method params: n_nearest.