dbscan - Density-Based Spatial Clustering of Applications with Noise (DBSCAN) and Related Algorithms
A fast reimplementation of several density-based algorithms of the DBSCAN family. Includes the clustering algorithms DBSCAN (density-based spatial clustering of applications with noise) and HDBSCAN (hierarchical DBSCAN), the ordering algorithm OPTICS (ordering points to identify the clustering structure), shared nearest neighbor clustering, and the outlier detection algorithms LOF (local outlier factor) and GLOSH (global-local outlier score from hierarchies). The implementations use the kd-tree data structure (from library ANN) for faster k-nearest neighbor search. An R interface to fast kNN and fixed-radius NN search is also provided. Hahsler, Piekenbrock and Doran (2019) <doi:10.18637/jss.v091.i01>.
Last updated 6 days ago
clustering, dbscan, density-based-clustering, hdbscan, lof, optics, cpp
15.54 score 315 stars 82 dependents 1.4k scripts 35k downloads

arules - Mining Association Rules and Frequent Itemsets
Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules). Also provides C implementations of the association mining algorithms Apriori and Eclat. Hahsler, Gruen and Hornik (2005) <doi:10.18637/jss.v014.i15>.
Last updated 4 days ago
arules, association-rules, frequent-itemsets
14.19 score 193 stars 28 dependents 3.2k scripts 37k downloads

seriation - Infrastructure for Ordering Objects Using Seriation
Infrastructure for ordering objects with an implementation of several seriation/sequencing/ordination techniques to reorder matrices, dissimilarity matrices, and dendrograms. Also provides (optimally) reordered heatmaps, color images and clustering visualizations like dissimilarity plots, and visual assessment of cluster tendency plots (VAT and iVAT). Hahsler et al (2008) <doi:10.18637/jss.v025.i03>.
Last updated 13 days ago
combinatorial-optimization, ordination, seriation, fortran
14.05 score 75 stars 82 dependents 568 scripts 33k downloads

TSP - Infrastructure for the Traveling Salesperson Problem
Basic infrastructure and some algorithms for the traveling salesperson problem (also traveling salesman problem; TSP). The package provides some simple algorithms and an interface to the Concorde TSP solver and its implementation of the Chained-Lin-Kernighan heuristic. The code for Concorde itself is not included in the package and has to be obtained separately. Hahsler and Hornik (2007) <doi:10.18637/jss.v023.i02>.
Last updated 3 months ago
concorde-tsp-solver, tsp
12.40 score 64 stars 102 dependents 304 scripts 28k downloads

arulesViz - Visualizing Association Rules and Frequent Itemsets
Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration. Michael Hahsler (2017) <doi:10.32614/RJ-2017-047>.
Last updated 4 months ago
arules, association-rules, frequent-itemsets, interactive-visualizations, visualization
11.15 score 53 stars 2 dependents 1.6k scripts 18k downloads

recommenderlab - Lab for Developing and Testing Recommender Algorithms
Provides a research infrastructure to develop and evaluate collaborative filtering recommender algorithms. This includes a sparse representation for user-item matrices, many popular algorithms, top-N recommendations, and cross-validation. Hahsler (2022) <doi:10.48550/arXiv.2205.12371>.
Last updated 4 months ago
collaborative-filtering, recommender-system
10.47 score 213 stars 2 dependents 868 scripts 3.8k downloads

stream - Infrastructure for Data Stream Mining
A framework for data stream modeling and associated data mining tasks such as clustering and classification. The development of this package was supported in part by NSF IIS-0948893, NSF CMMI 1728612, and NIH R21HG005912. Hahsler et al (2017) <doi:10.18637/jss.v076.i14>.
Last updated 4 months ago
data-stream-clustering, datastream, stream-mining, cpp
9.77 score 38 stars 3 dependents 134 scripts 1.1k downloads

rBLAST - R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) to search genetic sequence databases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated 2 months ago
genetics, sequencing, sequencematching, alignment, dataimport, bioconductor, bioinformatics, blast-search
8.04 score 105 stars 100 scripts 258 downloads

rBLAST - R Interface for the Basic Local Alignment Search Tool
Seamlessly interfaces the Basic Local Alignment Search Tool (BLAST) to search genetic sequence databases. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated 8 months ago
genetics, sequencing, sequencematching, alignment, dataimport, bioconductor, bioinformatics, blast-search
7.98 score 105 stars 100 scripts

qap - Heuristics for the Quadratic Assignment Problem (QAP)
Implements heuristics for the Quadratic Assignment Problem (QAP). Although the QAP was introduced as a combinatorial optimization problem for the facility location problem in operations research, it also has many applications in data analysis. The problem is NP-hard and the package implements a simulated annealing heuristic.
Last updated 4 months ago
combinatorial-optimization, heuristic, qap, quadratic-assignment-problem, fortran
7.66 score 5 stars 82 dependents 7 scripts 25k downloads

pomdp - Infrastructure for Partially Observable Markov Decision Processes (POMDP)
Provides the infrastructure to define and analyze the solutions of Partially Observable Markov Decision Process (POMDP) models. Interfaces for various exact and approximate solution algorithms are available including value iteration, point-based value iteration and SARSOP. Smallwood and Sondik (1973) <doi:10.1287/opre.21.5.1071>.
Last updated 14 days ago
control-theory, markov-decision-processes, optimization, cpp
7.29 score 18 stars 21 scripts 771 downloads

streamMOA - Interface for MOA Stream Clustering Algorithms
Interface for data stream clustering algorithms implemented in the MOA (Massive Online Analysis) framework (Albert Bifet, Geoff Holmes, Richard Kirkby, Bernhard Pfahringer (2010). MOA: Massive Online Analysis, Journal of Machine Learning Research 11: 1601-1604).
Last updated 4 months ago
clustering, datamining, datastream, openjdk
5.95 score 12 stars 37 scripts 742 downloads

arulesCBA - Classification Based on Association Rules
Provides the infrastructure for association rule-based classification including the algorithms CBA, CMAR, CPAR, C4.5, FOIL, PART, PRM, RCAR, and RIPPER to build associative classifiers. Hahsler et al (2019) <doi:10.32614/RJ-2019-048>.
Last updated 4 months ago
association-rules, classification
5.37 score 2 stars 1 dependents 43 scripts 2.3k downloads

rRDP - Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Last updated 8 months ago
genetics, sequencing, infrastructure, classification, microbiome, immunooncology, alignment, sequencematching, dataimport, bayesian, bioconductor, bioinformatics, openjdk
5.18 score 1 stars 3 scripts

cba - Clustering for Business Analytics
Implements clustering techniques such as Proximus and Rock, as well as utility functions for efficient computation of cross distances and data manipulation.
Last updated 4 months ago
4.86 score 3 dependents 171 scripts 2.6k downloads

markovDP - Infrastructure for Discrete-Time Markov Decision Processes (MDP)
Provides the infrastructure to work with Markov Decision Processes (MDPs) in R. The focus is on convenience in formulating MDPs, the support of sparse representations (using sparse matrices, lists and data.frames) and visualization of results. Some key components are implemented in C++ to speed up computation. Several popular solvers are implemented.
Last updated 27 days ago
control-theory, markov-decision-process, optimization
4.85 score 5 stars 1 scripts

rEMM - Extensible Markov Model for Modelling Temporal Relationships Between Clusters
Implements TRACDS (Temporal Relationships between Clusters for Data Streams), a generalization of Extensible Markov Model (EMM). TRACDS adds a temporal or order model to data stream clustering by superimposing a dynamically adapting Markov Chain. Also provides an implementation of EMM (TRACDS on top of tNN data stream clustering). Development of this package was supported in part by NSF IIS-0948893 and R21HG005912 from the National Human Genome Research Institute. Hahsler and Dunham (2010) <doi:10.18637/jss.v035.i05>.
Last updated 4 months ago
clustering, data-stream, sequence-analysis
4.49 score 1 stars 31 scripts 752 downloads

rRDP - Interface to the RDP Classifier
This package installs and interfaces the naive Bayesian classifier for 16S rRNA sequences developed by the Ribosomal Database Project (RDP). With this package the classifier trained with the standard training set can be used or a custom classifier can be trained.
Last updated 2 months ago
genetics, sequencing, infrastructure, classification, microbiome, immunooncology, alignment, sequencematching, dataimport, bayesian, bioconductor, bioinformatics, openjdk
4.48 score 1 stars 3 scripts 210 downloads

streamConnect - Connecting Stream Mining Components Using Sockets and Web Services
Adds functionality to connect stream mining components from package stream using sockets and Web services. The package can be used to create distributed workflows and plumber-based Web services that can be deployed on most common cloud services.
Last updated 4 months ago
3.78 score 2 stars 1 scripts 59 downloads

rMSA - Interface for Popular Multiple Sequence Alignment Tools
Seamlessly interfaces the Multiple Sequence Alignment software packages ClustalW, MAFFT, MUSCLE and Kalign (downloaded separately) and provides support to calculate distances between sequences. This work was partially supported by grant no. R21HG005912 from the National Human Genome Research Institute.
Last updated 7 months ago
genetics, sequencing, infrastructure, alignment, bioinformatics, sequence-alignment
3.70 score 10 stars 7 scripts

arulesSequences - Mining Frequent Sequences
Add-on for arules to handle and mine frequent sequences. Provides interfaces to the C++ implementation of cSPADE by Mohammed J. Zaki.
Last updated 4 months ago
3.63 score 11 stars 105 scripts 1.2k downloads

pomdpSolve - Interface to 'pomdp-solve' for Partially Observable Markov Decision Processes
Installs an updated version of 'pomdp-solve' and provides a low-level interface. Pomdp-solve is a program to solve Partially Observable Markov Decision Processes (POMDPs) using a variety of exact and approximate value iteration algorithms. A convenient R infrastructure is provided in the separate package pomdp. Kaelbling, Littman and Cassandra (1998) <doi:10.1016/S0004-3702(98)00023-X>.
Last updated 4 months ago
control-theory, markov-decision-processes, optimization
3.48 score 1 stars 1 dependents 3 scripts 667 downloads

arulesNBMiner - Mining NB-Frequent Itemsets and NB-Precise Rules
NBMiner is an implementation of the model-based mining algorithm for mining NB-frequent itemsets and NB-precise rules. Michael Hahsler (2006) <doi:10.1007/s10618-005-0026-2>.
Last updated 2 years ago
association-rules, openjdk
3.48 score 6 stars 10 scripts 303 downloads

recommenderlabJester - Jester Dataset for 'recommenderlab'
Provides the Jester Dataset for package recommenderlab.
Last updated 3 years ago
recommender-systems
2.70 score 1 scripts 166 downloads

recommenderlabBX - Book-Crossing Dataset (BX) for 'recommenderlab'
Provides the Book-Crossing Dataset for the package recommenderlab.
Last updated 3 years ago
recommender-systems
2.70 score 1 scripts 144 downloads