
Title: Can GPT Alleviate the Burden of Annotation?
Speaker: Morgan Gray, ISP PhD Student
Abstract: Manual annotation is just as burdensome as it is necessary for some legal text-analysis tasks. Given the promising performance of Generative Pretrained Transformers (GPT) on a range of tasks in the legal domain, it is natural to ask whether they can help with text annotation, or even replace manual annotation. Here we report a series of experiments using GPT-4 and GPT-3.5 as pre-annotation tools to determine whether a sentence in a legal opinion describes a legal factor. These GPT models assign labels that human annotators subsequently confirm or reject. To assess the utility of pre-annotating sentences at scale, we examine the agreement among gold-standard annotations, GPT’s pre-annotations, and law students’ annotations. The levels of agreement among these groups indicate that GPT-4 pre-annotation is a useful starting point for large-scale annotation of factors.
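A minimal sketch of the pre-annotation step described above, assuming the OpenAI chat completions API (openai Python package, v1+); the prompt wording, the yes/no label scheme, and the factor name are illustrative assumptions, not the speaker's protocol:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Hypothetical prompt template asking the model for a binary pre-annotation.
PROMPT = (
    "You are annotating sentences from legal opinions. "
    "Answer 'yes' if the sentence describes the legal factor '{factor}', "
    "otherwise answer 'no'.\n\n"
    "Sentence: {sentence}"
)

def pre_annotate(sentence: str, factor: str, model: str = "gpt-4") -> str:
    """Propose a label for one sentence; a human annotator later confirms or rejects it."""
    response = client.chat.completions.create(
        model=model,
        temperature=0,  # keep the labeling as deterministic as possible
        messages=[{"role": "user",
                   "content": PROMPT.format(factor=factor, sentence=sentence)}],
    )
    return response.choices[0].message.content.strip().lower()

# Agreement between pre-annotations and gold-standard labels can then be
# summarized, for example with Cohen's kappa:
#   from sklearn.metrics import cohen_kappa_score
#   kappa = cohen_kappa_score(gold_labels, gpt_labels)
```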
Title: Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection
Speaker: Kyle Buettner, ISP PhD Student
Abstract: Vision-language alignment learned from image-caption pairs has been shown to benefit tasks like object recognition and detection. Methods are mostly evaluated by how well object class names are learned, but captions also contain rich attribute context that should be considered when learning object alignment. It is unclear how methods use this context during learning, or whether models succeed on tasks that require both attribute and object understanding. To address this gap, we conduct an extensive analysis of the role of attributes in vision-language models. Specifically, we measure model sensitivity to the presence and meaning of attribute context, gauging its influence on object embeddings through unsupervised phrase grounding and classification-by-description methods. We further evaluate the utility of attribute context in training for open-vocabulary object detection and for fine-grained text-region retrieval and attribution tasks. Our results highlight that attribute context can be wasted when learning alignment for detection, that attribute meaning is not adequately considered in embeddings, and that attribute-based class descriptions are ineffective. A viable strategy that we find increases the benefit from attributes is contrastive training with adjective-based negative captions (a sketch of the idea follows below).
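As a rough illustration of the negative-caption strategy mentioned in the abstract's last sentence, here is a minimal sketch in PyTorch; the adjective-swap helper, the single-negative setup, and the exact loss form are illustrative assumptions, not the speaker's implementation:

```python
import torch
import torch.nn.functional as F

def adjective_negative(caption: str, adjective: str, replacement: str) -> str:
    # Hypothetical helper: swap the true adjective for a wrong one,
    # e.g. "a red car" -> "a blue car", to form a hard negative caption.
    return caption.replace(adjective, replacement)

def contrastive_loss(img_emb, pos_emb, neg_emb, temperature=0.07):
    # img_emb: (B, D) image embeddings
    # pos_emb: (B, D) embeddings of the true captions
    # neg_emb: (B, D) embeddings of adjective-swapped negative captions
    img_emb = F.normalize(img_emb, dim=-1)
    pos_emb = F.normalize(pos_emb, dim=-1)
    neg_emb = F.normalize(neg_emb, dim=-1)
    pos_sim = (img_emb * pos_emb).sum(-1) / temperature   # (B,)
    neg_sim = (img_emb * neg_emb).sum(-1) / temperature   # (B,)
    logits = torch.stack([pos_sim, neg_sim], dim=1)       # (B, 2)
    target = torch.zeros(logits.size(0), dtype=torch.long)  # true caption = index 0
    return F.cross_entropy(logits, target)
```

Training with such pairs penalizes the model whenever the adjective-swapped caption scores as high as the true one, which pushes attribute meaning into the learned alignment.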
Friday, December 1, 12:30 to 1:30 p.m.
Sennott Square, 5317
210 South Bouquet Street, Pittsburgh, PA 15260