API Module¶
Module grispy.core
¶
GriSPy core class.
-
class
grispy.core.
PeriodicityConf
(periodic_flag, pd_hi, pd_low, periodic_edges, periodic_direc)[source]¶ Bases:
object
Internal representation of the periodicity of the Grid.
-
class
grispy.core.
Grid
(data=None, N_cells=64, copy_data=False)[source]¶ Bases:
object
Grid indexing.
Grid is a regular grid indexing algorithm. This class indexes a set of k-dimensional points in a regular grid.
Parameters: - data (ndarray, shape(n,k)) – The n data points of dimension k to be indexed. This array is not copied, and so modifying this data may result in erroneous results. The data can be copied if the grid is built with copy_data=True.
- N_cells (positive int, optional) – The number of cells of each dimension to build the grid. The final grid will have N_cells**k number of cells. Default: 64
- copy_data (bool, optional) – Flag to indicate if the data should be copied in memory. Default: False
-
dim
¶ The dimension of a single data-point.
Type: int
-
grid
¶ This dictionary contains the data indexed in a grid. The key is a tuple with the k-dimensional index of each grid cell. Empty cells do not have a key. The value is a list of data points indices which are located within the given cell.
Type: dict
-
k_bins
¶ The limits of the grid cells in each dimension.
Type: ndarray, shape (N_cells + 1, k)
-
edges
¶ Grid edges or bound values. The lower and upper bounds per dimension.
Type: ndarray, shape (2, k)
-
epsilon
¶ Value of increment used to create the grid edges.
Type: float
-
ndata
¶ Total number of a data-points.
Type: int
-
shape
¶ Number of cells per dimension.
Type: tuple
-
size
¶ Total number of cells.
Type: int
-
cell_width
¶ Cell size in each dimension.
Type: ndarray
-
dim
Grid dimension.
-
edges
Edges of the grid in each dimension.
-
epsilon
Epsilon used to expand the grid.
-
ndata
Total number of a data-points.
-
shape
Grid shape, i.e. number of cells per dimension.
-
size
Grid size, i.e. total number of cells.
-
cell_width
Cell size in each dimension.
-
contains
(points)[source]¶ Check if points are contained within the grid.
Parameters: points (ndarray, shape (m,k)) – The point or points to check against the grid domain. Returns: bool – Boolean array indicating if a point is contained within the grid. Return type: ndarray, shape (m,)
-
cell_digits
(points)[source]¶ Return grid cell indices for a given point.
Parameters: points (ndarray, shape (m,k)) – The point or points to calculate the cell indices. Returns: digits – Array of cell indices with same shape as points. If a point is outside of the grid edges -1 is returned. Return type: ndarray, shape (m,k)
-
cell_id
(points)[source]¶ Return grid cell unique id for a given point.
Parameters: points (ndarray, shape (m,k)) – The point or points to calculate the cell unique id. Returns: ids – Array of cell unique ids for each point. If a point is outside of the grid edges -1 is returned. Return type: ndarray, shape (m,)
-
cell_digits2id
(digits)[source]¶ Return unique id of cells given their digits.
Parameters: digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers. Returns: ids – Array of cell unique ids for each point. Return type: ndarray, shape (m,)
-
cell_id2digits
(ids)[source]¶ Return cell digits given their unique id.
Parameters: ids (ndarray, shape (m,)) – Array of cell unique ids for each point. Returns: digits – Array of cell indices. Return type: ndarray, shape (m,k)
-
cell_walls
(digits)[source]¶ Return cell wall coordinates for given cell digits.
Parameters: digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers. Returns: - lower (ndarray, shape (m, 3)) – Lower cell wall values for each point.
- upper (ndarray, shape (m, 3)) – Upper cell wall values for each point.
-
cell_centre
(digits)[source]¶ Return cell centre coordinates for given cell digits.
Parameters: digits (ndarray, shape (m,k)) – Array of cell indices. Must be integers. Returns: centres – Cell centre for each point. Return type: ndarray, shape (m, k)
-
class
grispy.core.
GriSPy
(data=None, N_cells=64, copy_data=False, periodic=NOTHING, metric='euclid')[source]¶ Bases:
grispy.core.Grid
Grid Search in Python.
GriSPy is a regular grid search algorithm for quick nearest-neighbor lookup.
This class indexes a set of k-dimensional points in a regular grid providing a fast aproach for nearest neighbors queries. Optional periodic boundary conditions can be provided for each axis individually.
The algorithm has the following queries implemented: - bubble_neighbors: find neighbors within a given radius. A different radius for each centre can be provided. - shell_neighbors: find neighbors within given lower and upper radius. Different lower and upper radius can be provided for each centre. - nearest_neighbors: find the nth nearest neighbors for each centre.
Other methods: - set_periodicity: set periodicity condition after the grid was built.
To be implemented: - box_neighbors: find neighbors within a k-dimensional squared box of a given size and orientation. - n_jobs: number of cores for parallel computation.
Parameters: - data (ndarray, shape(n,k)) – The n data points of dimension k to be indexed. This array is not copied, and so modifying this data may result in erroneous results. The data can be copied if the grid is built with copy_data=True.
- N_cells (positive int, optional) – The number of cells of each dimension to build the grid. The final grid will have N_cells**k number of cells. Default: 64
- copy_data (bool, optional) – Flag to indicate if the data should be copied in memory. Default: False
- periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that correspond to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Important: The periodicity only works within one periodic range. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
- metric (str, optional) – Metric definition to compute distances. Options: ‘euclid’, ‘haversine’ ‘vincenty’ or a custom callable.
-
dim
¶ The dimension of a single data-point.
Type: int
-
grid
¶ This dictionary contains the data indexed in a grid. The key is a tuple with the k-dimensional index of each grid cell. Empty cells do not have a key. The value is a list of data points indices which are located within the given cell.
Type: dict
-
k_bins
¶ The limits of the grid cells in each dimension.
Type: ndarray, shape (N_cells + 1, k)
-
edges
¶ Grid edges or bound values. The lower and upper bounds per dimension.
Type: ndarray, shape (2, k)
-
epsilon
¶ Value of increment used to create the grid edges.
Type: float
-
ndata
¶ Total number of a data-points.
Type: int
-
shape
¶ Number of cells per dimension.
Type: tuple
-
size
¶ Total number of cells.
Type: int
-
cell_width
¶ Cell size in each dimension.
Type: ndarray
-
periodic_flag
¶ If any dimension has periodicity.
Type: bool
-
periodic_conf
¶ Statistics and intermediate results to make easy and fast the searchs with periodicity.
Type: grispy.core.PeriodicityConf
-
periodic_flag
Proxy to
periodic_conf_.periodic_flag
.
-
set_periodicity
(periodic, inplace=False)[source]¶ Set periodicity conditions.
This allows to define or change the periodicity limits without having to construct the grid again.
Important: The periodicity only works within one periodic range.
Parameters: - periodic (dict, optional) – Dictionary indicating if the data domain is periodic in some or all its dimensions. The key is an integer that corresponds to the number of dimensions in data, going from 0 to k-1. The value is a tuple with the domain limits and the data must be contained within these limits. If an axis is not specified, or if its value is None, it will be considered as non-periodic. Default: all axis set to None. Example, periodic = { 0: (0, 360), 1: None}.
- inplace (boolean, optional (default=False)) – If its True, set the periodicity on the current GriSPy instance and return None. Otherwise a new instance is created and returned.
-
bubble_neighbors
(centres, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]¶ Find all points within given distances of each centre.
Parameters: - centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
- distance_upper_bound (scalar or ndarray of length m) – The radius of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
- sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
- kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
- njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns: - distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
- indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.
-
shell_neighbors
(centres, distance_lower_bound=-1.0, distance_upper_bound=-1.0, sorted=False, kind='quicksort')[source]¶ Find all points within given lower and upper distances of each centre.
- The distance condition is:
- distance_lower_bound <= distance < distance_upper_bound
Parameters: - centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
- distance_lower_bound (scalar or ndarray of length m) – The minimum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
- distance_upper_bound (scalar or ndarray of length m) – The maximum distance of points to return. If a scalar is provided, the same distance will apply for every centre. An ndarray with individual distances can also be rovided.
- sorted (bool, optional) – If True the returned neighbors will be ordered by increasing distance to the centre. Default: False.
- kind (str, optional) – When sorted = True, the sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
- njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns: - distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
- indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.
-
nearest_neighbors
(centres, n=1, kind='quicksort')[source]¶ Find the n nearest-neighbors for each centre.
Parameters: - centres (ndarray, shape (m,k)) – The point or points to search for neighbors of.
- n (int, optional) – The number of neighbors to fetch for each centre. Default: 1.
- kind (str, optional) – The returned neighbors will be ordered by increasing distance to the centre. The sorting algorithm can be specified in this keyword. Available algorithms are: [‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’]. Default: ‘quicksort’
- njobs (int, optional) – Number of jobs for parallel computation. Not implemented yet.
Returns: - distances (list, length m) – Returns a list of m arrays. Each array has the distances to the neighbors of that centre.
- indices (list, length m) – Returns a list of m arrays. Each array has the indices to the neighbors of that centre.
Module grispy.distances
¶
Distances implementations for GriSPy.
-
grispy.distances.
euclid
(c0, centres, dim)[source]¶ Classic Euclidean distance.
In mathematics, the Euclidean distance or Euclidean metric is the “ordinary” straight-line distance between two points in Euclidean space. With this distance, Euclidean space becomes a metric space. The associated norm is called the Euclidean norm. Older literature refers to the metric as the Pythagorean metric. A generalized term for the Euclidean norm is the L2 norm or L2 distance. More info: https://en.wikipedia.org/wiki/Euclidean_distance
-
grispy.distances.
haversine
(c0, centres, dim)[source]¶ Distance using the Haversine formulae.
The haversine formula determines the great-circle distance between two points on a sphere given their longitudes and latitudes. Important in navigation, it is a special case of a more general formula in spherical trigonometry, the law of haversines, that relates the sides and angles of spherical triangles. More info: https://en.wikipedia.org/wiki/Haversine_formula
-
grispy.distances.
vincenty
(c0, centres, dim)[source]¶ Calculate distance on the surface of a spheroid with Vincenty Formulae.
Vincenty’s formulae are two related iterative methods used in geodesy to calculate the distance between two points on the surface of a spheroid, developed by Thaddeus Vincenty (1975a). They are based on the assumption that the figure of the Earth is an oblate spheroid, and hence are more accurate than methods that assume a spherical Earth, such as great-circle distance. More info: https://en.wikipedia.org/wiki/Vincenty%27s_formulae
Module grispy.validators
¶
Functions to validate GriSPy input parameters.
-
grispy.validators.
validate_distance_bound
(distance, periodic)[source]¶ Distance bounds, upper and lower, can be scalar or numpy array.