About OpenKnowledge@NAU | For NAU Authors

Sub-quadratic area-under-the-curve optimization

Rust, Kyle (2022) Sub-quadratic area-under-the-curve optimization. Masters thesis, Northern Arizona University.

[thumbnail of Rust_2022_sub-quadratic_area-under-the-curve_optimization.pdf] Text
Rust_2022_sub-quadratic_area-under-the-curve_optimization.pdf - Published Version
Restricted to Repository staff only

Download (1MB) | Request a copy

Abstract

The Receiver Operating Characteristic (ROC) curve is a plot of the false positive rate(FPR) versus the true positive rate (TPR) that is often used for evaluating the performance of binary classification models. Maximizing the Area Under the Curve (AUC) has been shown to be beneficial when the difference between the number of positive and negative labels is drastic. Because the AUC is is a piece-wise constant function, learning algorithms instead optimize convex relaxations that sum over all of the pairs of labeled positive and negative examples. The na ̈ıve approach to summing over all of these pairs of examples takes quadratic time which quickly becomes unfeasible for large batch sizes. In this document, the necessary background to understand the problem, a summary of related approaches to solving this problem, and proposals for computing the square and squared-hinge loss in either liner or log-linear time are presented. Finally, a empirical study that shows the improved time complexity of these algorithms, shows there are situations where it is advantageous to make use of larger batch sizes, and that it is possible to achieve improved model performance using these proposed methods.

Item Type: Thesis (Masters)
Publisher’s Statement: © Copyright is held by the author. Digital access to this material is made possible by the Cline Library, Northern Arizona University. Further transmission, reproduction or presentation of protected items is prohibited except with permission of the author.
Keywords: Area Under the Curve; Deep Learning; Reciever Operating Characteristic Curve; Square Loss; Squared Hinge Loss; Unbalaced Binary Classification
Subjects: Q Science > QA Mathematics
Q Science > QA Mathematics > QA76 Computer software
NAU Depositing Author Academic Status: Student
Department/Unit: Graduate College > Theses and Dissertations
College of Engineering, Informatics, and Applied Sciences > School of Informatics, Computing, and Cyber Systems
Date Deposited: 14 Jun 2023 16:10
Last Modified: 14 Jun 2023 16:10
URI: https://openknowledge.nau.edu/id/eprint/6017

Actions (login required)

IR Staff Record View IR Staff Record View

Downloads

Downloads per month over past year