Our system scales gracefully to very large image collections, enabling accurate crowd-sourced localization at scale. Our contribution to COLMAP, a widely used Structure-from-Motion package, is publicly available as an add-on at https://github.com/cvg/pixel-perfect-sfm.
3D animators have recently shown growing interest in applying artificial intelligence to choreographic design. Existing deep learning methods for dance generation are driven almost entirely by music, leaving little fine-grained control over the generated motions. We address this issue by introducing keyframe interpolation for music-driven dance generation, together with a novel choreography transition method. Our approach uses normalizing flows to learn the probability distribution of dance motions and thereby synthesizes diverse and plausible movements conditioned on the music and a sparse set of key poses, so the generated dance respects both the musical timing and the predefined poses. To enable robust transitions of variable duration between the key postures, we add a time embedding at each timestep as a further condition. Extensive experiments show that our model generates more realistic, diverse, and beat-matched dance motions than current state-of-the-art techniques, both qualitatively and quantitatively, and that keyframe-based control is particularly effective at diversifying the generated motions.
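As a concrete illustration of this kind of conditioning, the PyTorch sketch below shows one affine coupling step of a conditional normalizing flow whose context vector could concatenate music features, key-pose features, and a sinusoidal time embedding; the module names and dimensions are illustrative assumptions, not the paper's released code.

```python
import torch
import torch.nn as nn

class ConditionalAffineCoupling(nn.Module):
    """One affine coupling step of a conditional normalizing flow.

    The conditioning vector `cond` may concatenate music features,
    key-pose features, and a time embedding. Illustrative sketch only,
    not the authors' implementation.
    """
    def __init__(self, pose_dim: int, cond_dim: int, hidden: int = 256):
        super().__init__()
        self.half = pose_dim // 2  # split the pose vector in two
        self.net = nn.Sequential(
            nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (pose_dim - self.half)),
        )

    def forward(self, x, cond):
        x1, x2 = x[:, : self.half], x[:, self.half :]
        scale, shift = self.net(torch.cat([x1, cond], dim=-1)).chunk(2, dim=-1)
        scale = torch.tanh(scale)        # bound the scale for stability
        y2 = x2 * torch.exp(scale) + shift
        log_det = scale.sum(dim=-1)      # log|det J| of this coupling step
        return torch.cat([x1, y2], dim=-1), log_det

def time_embedding(t, dim: int = 32):
    """Sinusoidal embedding of normalized time t in [0, 1], one common way
    to condition each moment of a variable-length transition."""
    freqs = torch.exp(torch.arange(0, dim, 2) * (-torch.log(torch.tensor(1e4)) / dim))
    angles = t[:, None] * freqs[None, :]
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)
```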
Spiking Neural Networks (SNNs) represent and propagate information with discrete spikes. The conversion between real-valued and spiking signals, typically performed by spike encoding algorithms, therefore strongly shapes the encoding efficiency and performance of SNNs. To guide the choice of spike encoding algorithms for different spiking neural networks, this study evaluates four widely used algorithms. FPGA implementations of the algorithms provide measurements of calculation speed, resource consumption, precision, and noise immunity, which are used to judge each algorithm's suitability for neuromorphic SNN implementation. Two practical applications are additionally used to validate the evaluation results. By comparing and analyzing the evaluation data, the study characterizes the attributes and application domains of the algorithms. Broadly, the sliding-window algorithm has relatively low accuracy but is well suited to identifying signal trends; algorithms based on pulse-width modulation and the step-forward method reconstruct diverse signals accurately, with the exception of square waves, a deficiency for which Ben's Spiker algorithm compensates. Finally, a scoring method is developed for selecting spike coding algorithms, facilitating improved encoding efficiency in neuromorphic spiking neural networks.
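Of the algorithms named above, the step-forward method is simple enough to sketch directly. The NumPy version below follows the textbook formulation, in which a running baseline is stepped up or down by a fixed threshold whenever the signal departs from it; it is a software sketch, not the study's FPGA implementation.

```python
import numpy as np

def step_forward_encode(signal, threshold):
    """Step-forward (SF) encoding: emit a +1/-1 spike whenever the signal
    deviates from a running baseline by more than `threshold`."""
    baseline = signal[0]
    spikes = np.zeros(len(signal), dtype=np.int8)
    for i in range(1, len(signal)):
        if signal[i] > baseline + threshold:
            spikes[i] = 1
            baseline += threshold
        elif signal[i] < baseline - threshold:
            spikes[i] = -1
            baseline -= threshold
    return spikes

def step_forward_decode(spikes, start, threshold):
    """Reconstruct the signal by accumulating threshold-sized steps."""
    return start + threshold * np.cumsum(spikes)
```

The choice of `threshold` trades temporal resolution against spike count, which is exactly the kind of precision-versus-resource trade-off the FPGA evaluation measures.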
Image restoration under adverse weather conditions, which is crucial for many downstream computer vision applications, has drawn substantial attention. Recent successful methods owe their efficacy to advances in deep neural network architectures such as vision transformers. Motivated by progress in state-of-the-art conditional generative models, we introduce a novel patch-based image restoration method built on denoising diffusion probabilistic models. Our patch-based diffusion scheme enables size-agnostic restoration by applying a guided denoising process that smooths noise estimates across overlapping patches during inference. We evaluate the model empirically on benchmark datasets for image desnowing, combined deraining and dehazing, and raindrop removal, demonstrating state-of-the-art performance on both weather-specific and multi-weather restoration tasks as well as strong generalization to real-world test datasets.
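One plausible reading of the overlapping-patch smoothing is sketched below: the per-patch noise predictions of a diffusion model are accumulated into a full-resolution estimate and normalized by coverage counts. The function and parameter names are our assumptions, not the authors' implementation.

```python
import torch

def smoothed_noise_estimate(x_t, eps_model, patch=64, stride=32):
    """Average a diffusion model's noise predictions over overlapping
    patches of the noisy image x_t (shape B x C x H x W).

    `eps_model(crop)` is assumed to return a noise estimate with the same
    shape as `crop`. Assumes H and W align with the patch/stride grid.
    """
    b, c, h, w = x_t.shape
    acc = torch.zeros_like(x_t)   # summed noise estimates
    cnt = torch.zeros_like(x_t)   # how many patches cover each pixel
    for top in range(0, h - patch + 1, stride):
        for left in range(0, w - patch + 1, stride):
            crop = x_t[:, :, top : top + patch, left : left + patch]
            acc[:, :, top : top + patch, left : left + patch] += eps_model(crop)
            cnt[:, :, top : top + patch, left : left + patch] += 1.0
    return acc / cnt.clamp(min=1.0)
```

Because the denoiser only ever sees fixed-size patches, the same network restores images of arbitrary resolution, which is what makes the scheme size-agnostic.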
In many applications set in dynamic environments, data acquisition methods evolve over time, so sample attributes arrive incrementally and the feature space of stored samples grows progressively. In neuroimaging-based diagnosis of neuropsychiatric disorders, for example, the growing variety of testing methods steadily expands the set of available brain image features. High-dimensional data with heterogeneous features are inevitably difficult to manipulate and manage, and selecting valuable features in this incremental-feature setting poses a significant algorithmic design challenge. Motivated by this important yet under-explored problem, we develop a novel Adaptive Feature Selection method (AFS). AFS reuses a feature selection model previously trained on a subset of features and adapts it automatically so that the selection criterion holds across all features. In addition, an exact l0-norm sparsity constraint is imposed on feature selection, together with a proposed effective solving strategy. We present theoretical analyses of the generalization bound and convergence behavior. Starting from the single-instance case, we extend the analysis and solution to multiple instances of the problem. Extensive experiments demonstrate the effectiveness of reusing previous features and the advantages of the l0-norm constraint in a wide range of settings, as well as its strong ability to discriminate schizophrenic patients from healthy controls.
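To make the l0-norm constraint concrete, the sketch below solves a generic l0-constrained least-squares problem by iterative hard thresholding, keeping the k largest-magnitude weights at each step. It illustrates the style of constraint, not the AFS algorithm itself; all names are hypothetical.

```python
import numpy as np

def hard_threshold(w, k):
    """Project w onto the l0 ball {w : ||w||_0 <= k} by keeping the
    k largest-magnitude entries and zeroing the rest."""
    out = np.zeros_like(w)
    idx = np.argsort(np.abs(w))[-k:]
    out[idx] = w[idx]
    return out

def iht_feature_selection(X, y, k, lr=1e-3, iters=500):
    """Iterative hard thresholding for l0-constrained least squares.
    X: (n_samples, n_features), y: (n_samples,). The nonzero entries of
    the returned weight vector are the selected features."""
    w = np.zeros(X.shape[1])
    for _ in range(iters):
        grad = X.T @ (X @ w - y) / len(y)   # least-squares gradient
        w = hard_threshold(w - lr * grad, k)
    return w
```

Unlike l1 relaxations, the l0 ball fixes the number of selected features exactly, which is the advantage the abstract attributes to the constraint.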
Accuracy and speed are often the most important criteria for evaluating object tracking algorithms. Although deep network features benefit tracking, building a deep fully convolutional neural network (CNN) introduces tracking drift caused by convolution padding, the receptive field (RF), and the overall network stride, and it also slows the tracker. This article describes a fully convolutional Siamese network tracking algorithm that incorporates both attention mechanisms and feature pyramid networks (FPN), and improves computational efficiency with heterogeneous convolution kernels that reduce floating-point operations (FLOPs) and parameter count. The tracker first extracts image features with a novel fully convolutional CNN, incorporating a channel attention mechanism into feature extraction to strengthen the representational power of the convolutional features. The FPN then fuses convolutional features from high and low layers, the similarity of the fused features is computed, and the CNNs are trained. To increase speed, a heterogeneous convolution kernel replaces the traditional kernel, recovering the efficiency lost to the feature pyramid. The tracker is experimentally verified and analyzed on the VOT-2017, VOT-2018, OTB-2013, and OTB-2015 datasets. The results show that our tracker outperforms the state-of-the-art trackers compared against.
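The two efficiency ingredients named above can be sketched compactly in PyTorch: a simplified heterogeneous convolution that applies 3x3 kernels to a fraction of the input channels and cheap 1x1 kernels to the rest, and a squeeze-and-excitation style channel attention block. Both are illustrative stand-ins for the paper's layers, with our own class names and partitioning details.

```python
import torch
import torch.nn as nn

class HetConv(nn.Module):
    """Simplified heterogeneous convolution: 3x3 kernels see 1/p of the
    input channels, 1x1 kernels see the rest, and the outputs are summed,
    cutting FLOPs relative to a full 3x3 convolution."""
    def __init__(self, in_ch, out_ch, p=4):
        super().__init__()
        self.k3_ch = max(in_ch // p, 1)  # channels routed to 3x3 kernels
        self.conv3 = nn.Conv2d(self.k3_ch, out_ch, 3, padding=1, bias=False)
        self.conv1 = nn.Conv2d(in_ch - self.k3_ch, out_ch, 1, bias=False)

    def forward(self, x):
        return self.conv3(x[:, : self.k3_ch]) + self.conv1(x[:, self.k3_ch :])

class ChannelAttention(nn.Module):
    """Squeeze-and-excitation style channel attention, one common way to
    realize the channel attention mentioned in the abstract."""
    def __init__(self, ch, reduction=16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(ch, ch // reduction), nn.ReLU(),
            nn.Linear(ch // reduction, ch), nn.Sigmoid(),
        )

    def forward(self, x):
        w = self.fc(x.mean(dim=(2, 3)))   # global average pool -> (B, C)
        return x * w[:, :, None, None]    # reweight channels
```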
Medical image segmentation tasks have seen a significant boost in performance thanks to convolutional neural networks (CNNs). However, the large parameter counts of CNNs create deployment issues on devices with limited computational capabilities, such as embedded systems and mobile devices. Although some models with small memory footprints have been reported, most of them reduce segmentation accuracy. To address this issue, we propose a shape-guided ultralight network (SGU-Net) with extremely low computational cost. A notable contribution of SGU-Net is a novel ultralight convolution that performs asymmetric and depthwise separable convolutions concurrently; it not only reduces the parameter count but also improves the robustness of SGU-Net. Furthermore, SGU-Net employs an additional adversarial shape constraint so that the network learns the shape representation of the targets, which substantially improves segmentation accuracy on abdominal medical images via self-supervision. SGU-Net was tested extensively on four public benchmark datasets: LiTS, CHAOS, NIH-TCIA, and 3Dircadb. The experimental results show that SGU-Net achieves higher segmentation accuracy with a smaller memory footprint, outperforming the current leading networks. We further apply our ultralight convolution to a 3D volume segmentation network, obtaining comparable performance while reducing the parameter count and memory footprint. The code of SGU-Net is available at https://github.com/SUST-reynole/SGUNet.
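A minimal sketch of the kind of ultralight convolution described above: a depthwise 3x3 factored into asymmetric 3x1 and 1x3 depthwise kernels, followed by a pointwise convolution. This is our reading of how the two ideas combine, not the released SGU-Net layer.

```python
import torch.nn as nn

class UltralightConv(nn.Module):
    """Depthwise-separable convolution whose depthwise stage is factored
    into asymmetric 3x1 and 1x3 kernels. Per channel this needs 3 + 3
    depthwise weights instead of 9, on top of the usual pointwise mixing.
    Illustrative sketch under our own naming, not the authors' code."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.dw = nn.Sequential(
            nn.Conv2d(in_ch, in_ch, (3, 1), padding=(1, 0), groups=in_ch, bias=False),
            nn.Conv2d(in_ch, in_ch, (1, 3), padding=(0, 1), groups=in_ch, bias=False),
        )
        self.pw = nn.Conv2d(in_ch, out_ch, 1, bias=False)  # pointwise mixing

    def forward(self, x):
        return self.pw(self.dw(x))
```

Because the factorization applies per channel, the same trick carries over directly to 3D volume segmentation, consistent with the extension reported above.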
Cardiac image segmentation has been revolutionized by deep learning-based approaches. Despite their demonstrated efficacy, segmentation performance remains constrained by considerable variation across image domains, a phenomenon known as domain shift. Unsupervised domain adaptation (UDA) tackles this effect by training a model to align the labeled source domain and the unlabeled target domain in a common latent feature space, reducing the gap between them. This work introduces a novel framework, Partial Unbalanced Feature Transport (PUFT), for cross-modality cardiac image segmentation. Our model performs UDA using two Continuous Normalizing Flow-based Variational Auto-Encoders (CNF-VAE) together with a Partial Unbalanced Optimal Transport (PUOT) strategy. Whereas prior VAE-based UDA work relies on parameterized variational approximations of the latent features in different domains, we incorporate continuous normalizing flows (CNFs) into an extended VAE to obtain a more accurate probabilistic posterior and thereby reduce inference bias.
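To show how flow layers can sharpen a VAE posterior, the sketch below uses a single planar flow step as a discrete stand-in for the continuous normalizing flows the paper employs; both transform a Gaussian posterior sample into a more flexible one while tracking the log-determinant term needed in the evidence lower bound. Names and initialization are illustrative.

```python
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar flow step f(z) = z + u * tanh(w^T z + b).

    A discrete stand-in for a continuous normalizing flow: applied to a
    sample from the Gaussian VAE posterior, it yields a richer posterior
    whose density is corrected by the returned log|det J| term.
    (The invertibility constraint on u is omitted for brevity.)"""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                       # (batch,)
        f = z + self.u * torch.tanh(lin)[:, None]       # transformed sample
        psi = (1 - torch.tanh(lin) ** 2)[:, None] * self.w
        log_det = torch.log(torch.abs(1 + psi @ self.u) + 1e-8)
        return f, log_det
```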