Generative AI and STEM

Background

Artificial intelligence is not new. It has been part of our personal and work lives for a long time (autocorrect, facial recognition, satnav, etc.), and large language models like ChatGPT have been a big topic in education since version 3.5 was released in late November 2022. Large language models (LLMs) are trained on enormous amounts of data in order to recognize the patterns of and connections between words, and then produce text based on the probabilities of which word is most likely to come next. One thing that LLMs don’t do is computation. However, the most recent OpenAI release, GPT-4, seems to have made strides in standardized tests in many STEM areas, and GPT-4 now has a plug-in for Wolfram Alpha, which does do computation.

Chart from OpenAI: exam result improvements from ChatGPT 3.5 to 4

Andrew Roberts (Math Dept) and Susan Bonham (EdTech) did some testing to see how ChatGPT (3.5), GPT-4, and GPT-4 with the Wolfram plugin would handle some questions from Langara’s math courses.

Test Details

Full test results are available. An accessible version of the problems, full details of the “chats,” and the subsequent discussion of each AI response are available at the link.

The following questions were tested:

 

Problem 1: (supplied by Langara mathematics instructor Vijay Singh)

 

Problem 2: (Precalculus)

 

Problem 3: (Calculus I)

 

Problem 4: (Calculus II)

 

Discussion

Responses from current versions of ChatGPT are not reliable enough to be accepted uncritically.

ChatGPT needs to be approached as a tool and careful proof-reading of responses is needed to check for errors in computation or reasoning. Errors may be blatant and readily apparent, or subtle and hard to spot without close reading and a solid understanding of the concepts.

Perhaps the biggest danger for a student learning a subject is in the “plausibility” of many responses even when they are incorrect. ChatGPT will present its responses with full confidence in their correctness, whether this is justified or not.

When errors or a lack of clarity are noticed in a response, further prompting is needed to correct and refine the initial response. This requires a certain amount of base knowledge on the part of the user in order to guide ChatGPT to the correct solution.

Algebraic computations cannot be trusted as ChatGPT does not “know” the rules of algebra but is simply appending steps based on a probabilistic machine-learning model that references the material on which it was trained. The quality of the answers will depend on the quality of the content on which ChatGPT was trained. There is no way for us to know exactly what training material ChatGPT is referencing when generating its responses. The average quality of solutions sourced online should give us pause.

Below is one especially concerning example of an error encountered during our testing sessions:

In the response to the optimization problem (Problem 3), GPT-3.5 attempts to differentiate the volume function:

However, the derivative is computed as:

We see that it has incorrectly differentiated the first term with respect to R while correctly differentiating the second term with respect to h.

It is the plausibility of the above solution (despite the bad error) that is dangerous for a student who may take the ChatGPT response at face value.
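One inexpensive way to catch an error of this kind is to compare a claimed derivative with a numerical estimate. The sketch below is illustrative only: the cylinder volume function, the fixed height, and all function names are hypothetical stand-ins, not the function or the response from Problem 3.

```python
import math

def central_difference(f, x, step=1e-6):
    """Estimate f'(x) numerically with a symmetric difference quotient."""
    return (f(x + step) - f(x - step)) / (2 * step)

# Hypothetical example (NOT the function from Problem 3):
# volume of a cylinder of fixed height, as a function of the radius R.
HEIGHT = 5.0

def volume(R):
    return math.pi * R**2 * HEIGHT

def correct_derivative(R):
    # d/dR of pi * R^2 * h
    return 2 * math.pi * R * HEIGHT

def wrong_derivative(R):
    # The result of differentiating with respect to h instead of R
    return math.pi * R**2

def matches(claimed, f, x, tolerance=1e-4):
    """True if the claimed derivative agrees with the numerical estimate at x."""
    estimate = central_difference(f, x)
    return abs(claimed(x) - estimate) <= tolerance * max(1.0, abs(estimate))

print(matches(correct_derivative, volume, 2.0))
print(matches(wrong_derivative, volume, 2.0))
```

A check like this should be run at more than one point, because two different expressions can agree by coincidence at a single value (in this hypothetical example, the two expressions happen to be equal at R = 10).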

Access to the Wolfram plugin in GPT-4 should mean that algebraic computations that occur within requests sent to Wolfram can be trusted. But the issues of errors in reasoning and interpretation still exist between requests sent to Wolfram.

Concluding Thought

It will be important for us to educate our students about the dangers involved in using this tool uncritically while acknowledging the potential benefits if used correctly.

Want to Learn More?

EdTech and TCDC run workshops on various AI topics. You can request a bespoke AI workshop tailored to your department or check out the EdTech and TCDC workshop offerings. For all other questions, please contact edtech@langara.ca.

A.I. Detection: A Better Approach 

Over the past few months, EdTech has shared concerns about A.I. classifiers, such as Turnitin’s A.I. detection tool, AI Text Classifier, GPTZero, and ZeroGPT. Both in-house testing and statements from Turnitin and OpenAI confirm that A.I. text classifiers cannot reliably differentiate between A.I.-generated and human-generated writing. Given that the tools are unreliable and easy to manipulate, EdTech discourages their use. Instead, we suggest using Turnitin’s Similarity Report to help identify A.I.-hallucinated and fabricated references.

What is Turnitin’s Similarity Report?

The Turnitin Similarity Report quantifies how similar a submitted work is to other pieces of writing, including works on the Internet and those stored in Turnitin’s extensive database, highlighting sections that match existing sources. The similarity score represents the percentage of writing that is similar to other works. 

AI Generated References 

A.I. researchers call the tendency of A.I. to make stuff up a “hallucination.” A.I.-generated responses can appear convincing, but include irrelevant, nonsensical, or factually incorrect answers.  

ChatGPT and other natural language processing programs do a poor job of referencing sources and often fabricate plausible references. Because the references seem real, students often mistake them for legitimate ones.

Common reference or citation errors include: 

  • Failure to include a Digital Object Identifier (DOI) or incorrect DOI 
  • Misidentification of source information, such as journal or book title 
  • Incorrect publication dates 
  • Incorrect author information 

Using Turnitin to Identify Hallucinated References 

To use Turnitin to identify hallucinated or fabricated references, do not exclude quotes and bibliographic material from the Similarity Report. Quotes and bibliographic information will be flagged as matching or highly similar to source-based evidence. Fabricated quotes, references, and bibliographic information will have zero similarity because they will not match source-based evidence.

Quotes and bibliographic information with no similarity to existing works should be investigated to confirm that they are fabricated.  

References

Athaluri S, Manthena S, Kesapragada V, et al. (2023). Exploring the boundaries of reality: Investigating the phenomenon of artificial intelligence hallucination in scientific writing through ChatGPT references. Cureus 15(4): e37432. doi:10.7759/cureus.37432 

Metz, C. (2023, March 29). What makes A.I. chatbots go wrong? The curious case of the hallucinating software. New York Times. https://www.nytimes.com/2023/03/29/technology/ai-chatbots-hallucinations.html 

Aligning language models to follow instructions. (2022, January 27). OpenAI. https://openai.com/research/instruction-following 

Weise, K., and Metz, C. (2023, May 1). What A.I. chatbots hallucinate. New York Times. https://www.nytimes.com/2023/05/01/business/ai-chatbots-hallucination.html 

Welborn, A. (2023, March 9). ChatGPT and fake citations. Duke University Libraries. https://blogs.library.duke.edu/blog/2023/03/09/chatgpt-and-fake-citations/ 

screenshot of a Turnitin Similarity Report, with submitted text on the left and the report panel on the right

AI Classifiers — What’s the problem with detection tools?

AI classifiers don’t work!

Natural language processor AIs are meant to be convincing. They are creating content that “sounds plausible because it’s all derived from things that humans have said” (Marcus, 2023). The intent is to produce outputs that mimic human writing. The result: The world’s leading AI companies can’t reliably distinguish the products of their own machines from the work of humans.

In January, OpenAI released its own AI text classifier. According to OpenAI, “Our classifier is not fully reliable. In our evaluations on a ‘challenge set’ of English texts, our classifier correctly identifies 26% of AI-written text (true positives) as ‘likely AI-written,’ while incorrectly labeling human-written text as AI-written 9% of the time (false positives).”
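To see what those two rates imply in practice, the sketch below applies Bayes’ rule to them. The 26% and 9% figures come from the OpenAI statement above and apply to its “challenge set”; the shares of AI-written submissions are hypothetical assumptions chosen only for illustration, as is the function name.

```python
def probability_flag_is_correct(true_positive_rate, false_positive_rate, share_ai_written):
    """Bayes' rule: of all texts flagged as AI-written, what fraction really are?"""
    flagged_ai = true_positive_rate * share_ai_written
    flagged_human = false_positive_rate * (1 - share_ai_written)
    return flagged_ai / (flagged_ai + flagged_human)

TRUE_POSITIVE_RATE = 0.26   # from OpenAI's statement
FALSE_POSITIVE_RATE = 0.09  # from OpenAI's statement

# Hypothetical shares of submissions that are actually AI-written
for share in (0.05, 0.20, 0.50):
    chance = probability_flag_is_correct(TRUE_POSITIVE_RATE, FALSE_POSITIVE_RATE, share)
    print(f"AI-written share {share:.0%}: a flag is correct {chance:.0%} of the time")
```

Under the assumed 20% share, the function returns about 0.42, meaning fewer than half of the flagged texts would actually be AI-written.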

A bit about how AI classifiers identify AI-generated content

GPTZero, a commonly used detection tool, identifies AI created works based on two factors: perplexity and burstiness.

Perplexity measures how unpredictable a text is to a language model, which serves as a proxy for its complexity. Classifiers identify text that is predictable and lacking complexity as AI-generated and highly complex text as human-generated.

Burstiness compares variation between sentences. It measures how predictable a piece of content is by the homogeneity of the length and structure of sentences throughout the text. Human writing tends to be variable, switching between long and complex sentences and short, simpler ones. AI sentences tend to be more uniform with less creative variability.

The lower the perplexity and burstiness score, the more likely it is that text is AI generated.
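The two measures can be illustrated with a toy sketch. This is not GPTZero’s implementation: real detectors compute perplexity with a large language model, whereas the version below assumes a simple word-frequency (unigram) model built from a reference text, and it treats burstiness as the spread of sentence lengths. All function names are hypothetical.

```python
import math
import re
from collections import Counter
from statistics import pstdev

def split_sentences(text):
    """Naive sentence splitter on ., ! and ? (good enough for a toy example)."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]

def tokenize(text):
    """Lowercase the text and extract its words."""
    return re.findall(r"[a-z']+", text.lower())

def burstiness(text):
    """Spread of sentence lengths in words; 0.0 means all sentences are equally long."""
    lengths = [len(tokenize(s)) for s in split_sentences(text)]
    return pstdev(lengths) if len(lengths) > 1 else 0.0

def unigram_perplexity(text, reference_text):
    """Perplexity of text under an add-one-smoothed unigram model of reference_text."""
    counts = Counter(tokenize(reference_text))
    total = sum(counts.values())
    vocabulary_size = len(counts) + 1  # one extra slot shared by all unseen words
    words = tokenize(text)
    if not words:
        raise ValueError("text contains no words")
    log_probability = sum(
        math.log((counts[word] + 1) / (total + vocabulary_size)) for word in words
    )
    return math.exp(-log_probability / len(words))

reference = "the cat sat on the mat the cat sat"
print(unigram_perplexity("the cat sat", reference))              # predictable wording
print(unigram_perplexity("zebra quantum xylophone", reference))  # unexpected wording
print(burstiness("One two three. Four five six."))               # uniform sentences
print(burstiness("Stop. The quick brown fox jumps over the lazy dog today."))
```

In this toy model, the predictable wording and the uniform sentences produce the lower scores, which is the pattern a classifier of this kind would read as more likely AI-generated.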

Turnitin is a plagiarism-prevention tool that helps check the originality of student writing. On April 4th, Turnitin released an AI-detection feature.

According to Turnitin, its detection tool works a bit differently. In Turnitin’s words:

When a paper is submitted to Turnitin, the submission is first broken into segments of text that are roughly a few hundred words (about five to ten sentences). Those segments are then overlapped with each other to capture each sentence in context.

The segments are run against our AI detection model, and we give each sentence a score between 0 and 1 to determine whether it is written by a human or by AI. If our model determines that a sentence was not generated by AI, it will receive a score of 0. If it determines the entirety of the sentence was generated by AI it will receive a score of 1.

Using the average scores of all the segments within the document, the model then generates an overall prediction of how much text (with 98% confidence based on data that was collected and verified in our AI innovation lab) in the submission we believe has been generated by AI. For example, when we say that 40% of the overall text has been AI-generated, we’re 98% confident that is the case.

Currently, Turnitin’s AI writing detection model is trained to detect content from the GPT-3 and GPT-3.5 language models, which includes ChatGPT. Because the writing characteristics of GPT-4 are consistent with earlier model versions, our detector is able to detect content from GPT-4 (ChatGPT Plus) most of the time. We are actively working on expanding our model to enable us to better detect content from other AI language models.
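The segment-and-average procedure Turnitin describes can be sketched roughly as follows. Turnitin has not published its model, so everything here is an assumption made for illustration: the window size, the stride, the function names, and especially the sentence scorer, which is a placeholder for the actual detection model.

```python
from statistics import mean

def overlapping_segments(sentences, size=5, stride=2):
    """Split a list of sentences into overlapping windows of `size` sentences."""
    if len(sentences) <= size:
        return [sentences]
    segments = [
        sentences[start:start + size]
        for start in range(0, len(sentences) - size + 1, stride)
    ]
    # Add a final window if the stride left the last sentences uncovered
    if (len(sentences) - size) % stride != 0:
        segments.append(sentences[-size:])
    return segments

def document_ai_share(sentences, score_sentence, size=5, stride=2):
    """Average the per-sentence scores (0 = human, 1 = AI) within each segment,
    then average the segment scores into one figure for the document."""
    segment_scores = [
        mean([score_sentence(sentence) for sentence in segment])
        for segment in overlapping_segments(sentences, size, stride)
    ]
    return mean(segment_scores)

def placeholder_scorer(sentence):
    """Hypothetical stand-in for a detection model: flags sentences tagged [AI]."""
    return 1.0 if sentence.startswith("[AI]") else 0.0

document = ["[AI] Sentence."] * 4 + ["Human sentence."] * 4
print(document_ai_share(document, placeholder_scorer))
```

Because the windows overlap, sentences in the middle of a document contribute to more segments than those at the edges, so the overall figure is not simply the fraction of flagged sentences.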

The Issues

AI detectors cannot prove conclusively if text is AI generated. With minimal editing, AI-generated content evades detection.

L2 writers tend to write with less “burstiness.” Concern about bias is one of the reasons UBC chose not to enable Turnitin’s AI-detection feature.

ChatGPT’s writing style may be less easy to spot than some think.

Privacy violations are a concern with both generators and detectors as both collect data.

Now what?

Langara’s EdTech, TCDC, and SCAI departments are working together to offer workshops on four potential approaches: Embrace it, Neutralize it, Ban it, Ignore it. Interested in a bespoke workshop for your department? Complete the request form.


References
Marcus, G. (2023, January 6). Ezra Klein interviews Gary Marcus [Audio podcast episode]. In The Ezra Klein Show. https://www.nytimes.com/2023/01/06/podcasts/transcript-ezra-klein-interviews-gary-marcus.html

Fowler, G.A. (2023, April 3). We tested a new ChatGPT-detector for teachers. It flagged an innocent student. Washington Post. https://www.washingtonpost.com/technology/2023/04/01/chatgpt-cheating-detection-Turnitin/

AI Detection Tool Testing — Initial Results

We’ve limited our testing to Turnitin’s AI detection tool. Why? Turnitin has undergone privacy and risk reviews and is a college-approved technology. Other detection tools haven’t been reviewed and may not meet recommended data privacy standards.

What We’ve Learned So Far

  • Unedited AI-generated content often receives a 100% AI-generated score, although more complex writing by ChatGPT4 can score far less than 100%.
  • Adding typos and grammar mistakes or prompting the AI generator to include errors throughout a document can change the AI-generated score from 100% to 0%.
  • Adding I-statements throughout a document has a dramatic impact in lowering the AI score. 
  • Interrupting the flow of text by replacing one word every couple of sentences with a less likely word increases the perplexity of the wording and lowers the AI-generated percentage. AI text generators act like text predictors, creating text by adding the most likely next word. If the detector is perplexed by a word because the word is not the most likely choice, then the text is determined to be human written.
  • Unlike human-generated writing, AI sentences tend to be uniform. Changing the length of sentences throughout a document, making some sentences shorter and others longer and more complex, alters the burstiness and lowers the generated-by-AI score. 
  • By replacing one or two words per paragraph and modifying the length of sentences here and there throughout a chunk of text — i.e. by doing minor tweaks of both perplexity and burstiness — the AI-generated score changes from 100% to 0%. 

To learn more about how AI detection tools work, read AI Classifiers — What’s the problem with detection tools?

AI tools & privacy

ChatGPT is underpinned by a large language model that requires massive amounts of data to function and improve. The more data the model is trained on, the better it gets at detecting patterns, anticipating what will come next and generating plausible text.

Uri Gal notes the following privacy concerns in The Conversation:

  • None of us were asked whether OpenAI could use our data. This is a clear violation of privacy, especially when data are sensitive and can be used to identify us, our family members, or our location.
  • Even when data are publicly available their use can breach what we call contextual integrity. This is a fundamental principle in legal discussions of privacy. It requires that individuals’ information is not revealed outside of the context in which it was originally produced.
  • OpenAI offers no procedures for individuals to check whether the company stores their personal information, or to request it be deleted. This is a guaranteed right in accordance with the European General Data Protection Regulation (GDPR) – although it’s still under debate whether ChatGPT is compliant with GDPR requirements.
  • This “right to be forgotten” is particularly important in cases where the information is inaccurate or misleading, which seems to be a regular occurrence with ChatGPT.
  • Moreover, the scraped data ChatGPT was trained on can be proprietary or copyrighted.

When we use AI tools, including detection tools, we are feeding data into these systems. It is important that we understand our obligations and risks.

When an assignment is submitted to Turnitin, the student’s work is saved as part of Turnitin’s database of more than 1 billion student papers. This raises privacy concerns that include:

  • Students’ inability to remove their work from the database
  • The indefinite length of time that papers are stored
  • Access to the content of the papers, especially personal data or sensitive content, including potential security breaches of the server

AI detection tools, including Turnitin, should not be used without students’ knowledge and consent. While Turnitin is a college-approved tool, using it without students’ consent poses a copyright risk (Strawczynski, 2004).  Other AI detection tools have not undergone privacy and risk assessments by our institution and present potential data privacy and copyright risks.

For more information, see our Guidelines for Using Turnitin.

Getting Started with ChatGPT

Tips for writing effective prompts

Prompt-crafting takes practice:

  • Focus on tasks where you are an expert & get GPT to help.
  • Give the AI context.
  • Give it step-by-step directions.
  • Get an initial answer. Ask for changes and edits.

Provide as much context as possible and use specific and detailed language. You can include information about:

  • Your desired focus, format, style, intended audience and text length.
  • A list of points you want addressed.
  • What perspective you want the text written from, if applicable.
  • Specific requirements, such as no jargon.

Try an iterative approach

Ethan Mollick offers the following:

  • The best way to use AI systems is not to craft the perfect prompt, but rather to use it interactively. Try asking for something. Then ask the AI to modify or adjust its output. Work with the AI, rather than trying to issue a single command that does everything you want. The more you experiment, the better off you are. Just use the AI a lot, and it will make a big difference – a lesson my class learned as they worked with the AI to create essays.
  • More elaborate and specific prompts work better.
  • Don’t ask it to write an essay about how human error causes catastrophes. The AI will come up with a boring and straightforward piece that does the minimum possible to satisfy your simple demand. Instead, remember you are the expert, and the AI is a tool to help you write. You should push it in the direction you want. For example, provide clear bullet points to your argument: write an essay with the following points: -Humans are prone to error -Most errors are not that important -In complex systems, some errors are catastrophic -Catastrophes cannot be avoided.
  • But even these results are much less interesting than a more complicated prompt: write an essay with the following points. use an academic tone. use at least one clear example. make it concise. write for a well-informed audience. use a style like the New Yorker. make it at least 7 paragraphs. vary the language in each one. end with an ominous note. -Humans are prone to error -Most errors are not that important -In complex systems, some errors are catastrophic -Catastrophes cannot be avoided
  • Try asking for it to be concise or wordy or detailed, or ask it to be specific or to give examples. Ask it to write in a tone (ominous, academic, straightforward) or to a particular audience (professional, student) or in the style of a particular author or publication (New York Times, tabloid news, academic journal). You are not going to get perfect results, so experimenting (and using the little “regenerate response” button) will help you get to the right place. Over time, you will start to learn the “language” that ChatGPT is using.

Get ChatGPT to ask you questions

Instead of coming up with your own prompts, try getting the AI to ask you questions to get the information it needs. In a recent Twitter post, Ethan Mollick notes that this approach produced surprisingly good results.

Ideas for using ChatGPT with students

For lots of great ideas and advice, watch Unlocking the Power of AI: How Tools Like ChatGPT Can Make Teaching Easier and More Effective.

  • Use it to create counterarguments to students’ work. Students can use the AI output to further refine their arguments and help them clarify their positions.
  • Use it to write something for different audiences and have students compare the output and identify how writing changes for a general versus expert audience.
  • Use ChatGPT for a first draft and then have students edit a second draft with critiques, corrections, and additions.
  • Use it to start a discussion. For example, ask ChatGPT why one theory is better than another. Then, ask again why the second theory is better.
  • Use it to generate a list of common misconceptions and then have students address them.
  • Ask students to generate a ChatGPT response to a question of their own choosing, and then write an analysis of the strengths and weaknesses of the ChatGPT response.

Some ways you can use ChatGPT

  • Use it to create a bank of multiple choice and short-answer questions for formative assessment. It can also pre-generate sample responses and feedback.
  • Use it to create examples.
  • Use it to generate ten prompts for a class discussion.

Further reading and resources

Heaven, W.D. (2023, April 6). ChatGPT is going to change education, not destroy it. MIT Technology Review.

Liu, D. et al. (2023). How AI can be used meaningfully by teachers and students in 2023. Teaching@Sydney.

Mollick, E. R., & Mollick, L. (2022). New modes of learning enabled by AI Chatbots: Three methods and assignments. SSRN Electronic Journal. https://doi.org/10.2139/ssrn.4300783

Rudolph, J. et al. (2023). ChatGPT: Bullshit spewer or the end of traditional assessments in higher education? Journal of Applied Learning and Teaching, 6(1).

ETUG Spring Workshop 2023

The Educational Technology Users Group (ETUG) is a community of BC post-secondary educators focused on the ways in which learning and teaching can be enhanced through technology. ETUG’s mission is to support and nurture a vibrant, innovative, evolving, and supportive community that thrives with the collegial sharing of ideas, resources, and ongoing professional development through face-to-face workshops and online activities.

Spring Workshop

This two-day online and in-person workshop will showcase how instructors, education developers, and education technologists are approaching design. For example, we’ll explore how digital literacy, inclusive technology, and AI could be “baked” into courses and how instructors are supported in making design decisions around technology. We’ll also consider the ongoing communication and capacity-building at institutions around digital literacy, accessibility, and AI, such as how teaching and learning centres and libraries get the word out to instructors and students about new approaches and resources.

Join ETUG online in Zoom or in-person at Kwantlen Polytechnic University Lansdowne Road Campus in Richmond, B.C. for this 2-day hybrid event, sponsored by BCcampus.

  • Day 1: June 1, 2023: 9:00 AM to 4:30 PM Pacific Time
  • Day 2: June 2, 2023: 9:00 AM to 4:00 PM Pacific Time

Workshop Rates: 2-day Registration Only

  • Early Bird In-Person: $175 CAD + 5% GST (ends April 29 at 11:59 PM)
  • Regular Rate In-Person: $200 CAD + 5% GST
  • Online: $150 CAD + 5% GST
  • Students: Free

Childcare

Attending an event sometimes means choosing professional development at the expense of spending time with family, but for large, multi-day events hosted by BCcampus, participants do not have to choose one over the other. Please let us know when you register if you will require childcare. You can read more about our childcare program and provider in the Childcare Program Information.

In order to secure nannies in time, our childcare registration cut-off date is May 15, 2023. 

Registration

Register online to attend the ETUG Spring Workshop

 

Fostering Learner Engagement with ePortfolios

EdTech is pleased to welcome Dr. Gail Ring and Dr. Melissa Shaquid Pirie Cross to campus to share their expertise on ePortfolios on November 21st from 11:00-12:00 (in person and online).

Registration Information

Here’s what they’ve shared about their presentation:

True learning ePortfolios provide students with multiple opportunities to revisit and reconsider the evidence of their learning experiences and present that learning to an external audience. As ePortfolio practitioners and evangelists, we have long believed in the power of ePortfolios to facilitate student learning, agency and engagement. We also understand that the practices of folio thinking, and the benefits that can be achieved by those practices, often require a pedagogical shift from both faculty and students.

In this presentation we will share stories that demonstrate how portfolios can contribute to a more learner-centered, process-oriented approach to teaching and learning, supporting:

  • Reflection by giving students an opportunity to pause and reflect on their accomplishments, which often reveals new learning that can contribute to the development of their professional and digital identities.
  • Integrative learning over time, across contexts, and with intention (Patton and Reynolds, 2014) through Portfolio development and folio thinking practices.
  • Engagement of faculty in professional development applications and uses that lead to the integration of portfolios into instruction and assessment throughout the curriculum.

The results of these efforts include reflective, evidence-rich portfolios that have future value for both students and the university to showcase learning successes throughout/across the learning journey.

We will share a variety of examples that encompass everything from preparation for university to preparation for career. The examples presented will demonstrate holistic learning and lifelong folio participation practices.

Bios: 

Dr. Gail Ring, Director of Service and Partnerships for PebblePad, North America

Gail has had an extensive career in higher education. In addition to her work as an educator, she has founded and directed a number of teaching and learning centers. Formerly, she was the Director, Portfolio Program, Clemson University. For more information about Dr. Ring, including her research and publications, please see her professional portfolio.

Dr. Melissa Shaquid Pirie Cross, Implementation Specialist for PebblePad, North America

In addition to being an educator, Melissa has had roles as a public relations and retention specialist, a coordinator of dual enrollment programs, a director of student and academic services, and a faculty training and development coordinator in several community colleges and public universities. She has taught with portfolios extensively at Portland State University and is passionate about sharing her expertise with folio pedagogy.

Captions are now automatic on all new Kaltura media

New media content added to Kaltura MediaSpace will be automatically captioned, whether uploaded via the Langara MediaSpace website at https://mediaspace.langara.ca, or via My Tools > My Media in Brightspace. These captions are machine-generated and should be available within 30 minutes of uploading your file. All media, including screen recordings, file uploads, web recordings, and most YouTube imports, will have captions added when uploaded to Kaltura. These are closed captions that can be deactivated by the media owner and when available, toggled on and off by the viewer. Existing media—uploaded before October 18th, 2022—will not have captions automatically added but you can request captions for this media.

Keep in mind that machine-generated captions are only 85% accurate and will not meet the requirements of students with closed captioning accommodations. Students requiring an accommodation will contact Accessibility Services, who will inform you directly. If you have a student who requires closed captions, edit your captions to ensure they are 99% accurate or contact Langara’s Assistive Technologist to request assistance with human-edited closed captions.

We developed a Closed Captions slideshow (below) to provide step-by-step instructions for all you need to know about captioning your media in Kaltura MediaSpace/My Media.