Question

Credit card fraud is fraud perpetrated through stolen credit cards or credit card information. For years, credit card issuers have been using data mining and statistical tools to detect fraud. Citibank reported that knowing the type of product or service bought, frequency of purchases, and size and location of transaction can significantly reduce fraud. (Source: Jesus Mena, Investigative Data Mining for Security and Criminal Detection, Butterworth-Heinemann, pp. 250-251)

A data-mining analyst at a major credit card company would like to construct and test a simple logistic regression model for detecting credit card fraud using data on card transactions classified as either fraudulent or non-fraudulent.

The dependent variable for the model is:

y = 1 if the transaction is due to credit card fraud; 0 if the transaction is not due to credit card fraud

The independent variables for the model are chosen from the following:

x1 = dollar amount of the transaction
x2 = number of transactions in the preceding 12 hours
x3 = 1 if the Standard Industry Code (SIC) for the product or service bought never appeared in the card owner's transaction history; 0 if otherwise
x4 = 1 if the ZIP code of the transaction never appeared in the card owner's transaction history; 0 if otherwise

The analyst would like to test a logistic regression model that predicts credit card fraud using the dollar amount of the transaction, the number of transactions in the preceding 12 hours, and the indicator variable for whether the ZIP code of the transaction never appeared in the card owner's transaction history. (Note: Actual fraud-detection models used by credit card companies are much more complicated than the above, including up to hundreds of independent variables.)

The logistic regression equation for the above model is: ____

If x4 = 1, the mean of the dependent variable y is the ____ that the transaction is ____ when the ZIP code of the transaction ____ appeared in the card owner's transaction history before, for given values of x1 and x2.

The analyst uses computer software to estimate the parameters of the logistic regression model. The G statistic for the test of overall significance ____ degrees of freedom. If the value of the G test statistic is 10.25, its p-value is ____. At significance level α = .05, you conclude that the overall model is significant.

The estimated coefficients of the model are shown below:

Predictor                           Coefficient   Standard Error
Constant                            0.0297        0.0301
Dollar amount of the transaction    0.0045        0.0022
Number of preceding transactions    0.2701        0.1421
ZIP code appeared before            0.1645        0.0632

The z statistic for the test of the significance of the independent variable x1 is ____ and has a p-value of ____. You conclude that x1 is significant at significance level α = .05.

The estimated logit for the regression model is:

O g(x1, x2, x4) = log(0.0297 + 0.0045x1 + 0.2701x2 + 0.1645x4)
O g(x1, x2, x4) = exp(0.0297 + 0.0045x1 + 0.2701x2 + 0.1645x4)
O g(x1, x2, x4) = 0.0297 + 0.0045x1 + 0.2701x2 + 0.1645x4
O g(x1, x2, x4) = exp(0.0297 + 0.0045x1 + 0.2701x2 + 0.1645x4) / [1 + exp(0.0297 + 0.0045x1 + 0.2701x2 + 0.1645x4)]
The odds ratio related to the coefficient β1 of the variable x1 is given by ____. ____ the odds that the transaction is fraudulent ____ to obtain the odds of a fraudulent transaction when the dollar amount of the transaction increases by 1. The estimated value of this odds ratio is ____.

Suppose the dollar amount of the transaction is $412 and the ZIP code of the product or service never appeared in the owner's transaction history. When the number of preceding transactions is 4, the estimated odds that the transaction is fraudulent is ____, and the corresponding estimated probability is ____. (Hint: The estimated odds can be obtained by taking the exponential of the estimated logit with the relevant values for the independent variables. The corresponding estimated probability can be computed by dividing the estimated odds by 1 plus the estimated odds.)

If the dollar amount of the transaction increases by 1 unit to $413, the estimated odds is ____, and the corresponding estimated probability becomes ____. The ratio of the second odds to the first is ____.
Answer #1

SOLUTION

The logistic regression equation for the above model is:

E(y) = exp(β0 + β1 x1 + β2 x2 + β4 x4) / [1+ exp(β0 + β1 x1 + β2 x2 + β4 x4)]

The z statistic for the test of the significance of the independent variable x4 is z = 0.1645 / 0.0632 ≈ 2.60. Since P(Z ≤ 2.60) = 0.9953, the two-sided p-value is 2 × (1 − 0.9953) ≈ 0.009. Because |z| > 1.96 (equivalently, the p-value is below .05), we conclude that x4 is significant at significance level α = .05.
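The z statistic and its two-sided p-value can be checked numerically. The sketch below is only an illustration: the helper name `wald_z_test` is made up for this example, and it assumes the usual large-sample test in which the estimate divided by its standard error is compared with a standard normal distribution.

```python
import math

def wald_z_test(coef, std_err):
    """Return (z, two-sided p-value) for H0: coefficient = 0.

    Hypothetical helper for illustration; assumes z is approximately
    standard normal under H0.
    """
    z = coef / std_err
    # Two-sided standard normal tail area: P(|Z| > |z|) = erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Estimate and standard error for x4 from the coefficient table
z, p = wald_z_test(0.1645, 0.0632)
print(f"z = {z:.2f}, two-sided p-value = {p:.4f}")
```

The same helper applies to any other row of the table, for example `wald_z_test(0.0045, 0.0022)` for x1.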

The estimated logit for the regression model is the linear predictor:

g(x1, x2, x4) = 0.0297 + 0.0045 x1 + 0.2701 x2 + 0.1645 x4

The estimated probability of fraud is then exp(g) / [1 + exp(g)], with g = g(x1, x2, x4).

The odds ratio related to the coefficient β2 of the variable x2 is given by exp(β2). The estimated value of this odds ratio is exp(0.2701) ≈ 1.31.
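Why exp(β2) is the odds ratio follows from the model equation: the odds are the exponential of the logit, so raising x2 by one unit multiplies them by a constant. A short derivation, writing π for E(y) (the symbol π is introduced here only for convenience):

```latex
\begin{aligned}
\mathrm{odds}(x_1, x_2, x_4)
  &= \frac{\pi}{1 - \pi}
   = \exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_4 x_4) \\[4pt]
\frac{\mathrm{odds}(x_1, x_2 + 1, x_4)}{\mathrm{odds}(x_1, x_2, x_4)}
  &= \frac{\exp(\beta_0 + \beta_1 x_1 + \beta_2 (x_2 + 1) + \beta_4 x_4)}
          {\exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_4 x_4)}
   = \exp(\beta_2)
\end{aligned}
```

The ratio does not depend on x1, x2, or x4, which is why a single number summarizes the effect of one additional preceding transaction.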

Odds1 = exp(0.0297 + 0.0045 * 378 + 0.2701 * 5 + 0.1645 * 1) = exp(3.2457) ≈ 25.68

Probability1 = Odds1 / (1 + Odds1) ≈ 0.9625

Odds2 = exp(0.0297 + 0.0045 * 378 + 0.2701 * 6 + 0.1645 * 1) = exp(3.5158) ≈ 33.643

Probability2 = Odds2 / (1 + Odds2) ≈ 0.9711

Suppose the dollar amount is $378 and the ZIP code never appeared in the card owner's transaction history. When the number of preceding transactions is 5, the estimated odds that the transaction is fraudulent is 25.68, and the corresponding estimated probability is 0.9625.

If the number of preceding transactions increases to 6, the estimated odds is 33.643, and the estimated probability is 0.9711.

The ratio of the second odds to the first is 33.643 / 25.68 ≈ 1.31, which matches the odds ratio exp(0.2701) for x2.
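The odds and probability calculations above can be reproduced with a few lines of Python. This is a minimal sketch that hard-codes the estimated coefficients from the table; the function names `logit`, `odds`, and `probability` are chosen for this example and do not come from any statistics library.

```python
import math

# Estimated coefficients from the fitted model (constant, x1, x2, x4)
B0, B1, B2, B4 = 0.0297, 0.0045, 0.2701, 0.1645

def logit(x1, x2, x4):
    """Estimated logit: the linear predictor g(x1, x2, x4)."""
    return B0 + B1 * x1 + B2 * x2 + B4 * x4

def odds(x1, x2, x4):
    """Estimated odds of fraud: exponential of the estimated logit."""
    return math.exp(logit(x1, x2, x4))

def probability(x1, x2, x4):
    """Estimated probability of fraud: odds / (1 + odds)."""
    o = odds(x1, x2, x4)
    return o / (1 + o)

# $378 transaction, ZIP code never seen before, 5 and then 6 preceding transactions
odds_5 = odds(378, 5, 1)
odds_6 = odds(378, 6, 1)
prob_5 = probability(378, 5, 1)
prob_6 = probability(378, 6, 1)

print(f"odds with 5 transactions: {odds_5:.3f}, probability: {prob_5:.4f}")
print(f"odds with 6 transactions: {odds_6:.3f}, probability: {prob_6:.4f}")
print(f"ratio of odds: {odds_6 / odds_5:.2f}")
```

The same functions handle other inputs, for example `odds(412, 4, 1)` and `odds(413, 4, 1)` for the $412 and $413 scenario described in the question.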
